PACKAGING CELL LINES FOR HIV-DERIVED RETROVIRAL VECTOR PARTICLES

BACKGROUND OF THE INVENTION 

Retroviral vectors based on lentiviruses, such as human immunodeficiency 
viruses (HIV), can infect nondividing cells, and integration of proviral DNA occurs
without the need for cell division. These properties make lentiviruses attractive for gene 
transfer into nondividing cells, such as hepatocytes, myofibers, hematopoietic stem 
cells, and neurons. 

However, the use of lentivirus vectors, particularly HIV vectors, for gene therapy
is hampered by concern over their safety. Thus, a need exists for the development of
lentivirus vectors, particularly HIV vectors, with improved safety, particularly for
gene therapy.

SUMMARY OF THE INVENTION 

The present invention relates to novel packaging cell lines useful for generating 
viral accessory protein independent lentivirus-derived, particularly HIV-derived,
retroviral vector particles, to construction of such cell lines and to methods of using the 
accessory protein independent lentivirus-derived retroviral vector particles to introduce 
DNA of interest into cells (e.g., eukaryotic cells such as animal (particularly
mammalian), plant or yeast cells or prokaryotic cells such as bacterial cells). In a 
preferred embodiment, the packaging cell lines of the present invention are stable 
packaging cell lines. 

In one embodiment of the invention, packaging cell lines for producing viral
accessory protein independent lentivirus-derived retroviral vector particles comprise (a)
a cell (e.g., mammalian cell); and (b) a retroviral nucleotide sequence in the cell which
comprises a coding sequence for lentivirus gagpol, wherein said coding sequence has
been codon optimized by mutagenesis to improve expression of the lentivirus gagpol
proteins.

In a second embodiment of the invention, packaging cell lines for producing
viral accessory protein independent lentivirus-derived retroviral vector particles
comprise (a) a cell (e.g., mammalian cell); (b) a first retroviral nucleotide sequence in
the cell which comprises a coding sequence for lentivirus gagpol, wherein said coding
sequence has been codon optimized by mutagenesis to improve expression of the
lentivirus gagpol proteins; and (c) a second retroviral nucleotide sequence in the cell
which comprises the coding sequence for a heterologous envelope protein.

In a third embodiment of the invention, packaging cell lines for producing viral
accessory protein independent lentivirus-derived retroviral vector particles comprise (a)
a cell (e.g., mammalian cell); (b) a first retroviral nucleotide sequence in the cell which
comprises a coding sequence for lentivirus gagpol, wherein said coding sequence has
been codon optimized by mutagenesis to improve expression of the lentivirus gagpol
proteins; (c) a second retroviral nucleotide sequence in the cell which comprises the
coding sequence for a heterologous envelope protein; and (d) a third retroviral
nucleotide sequence which comprises a DNA sequence of interest and lentivirus
cis-acting sequences required for packaging, reverse transcription and integration.

In a fourth embodiment of the invention, packaging cell lines for producing
viral accessory protein independent lentivirus-derived retroviral vector particles
comprise (a) a cell (e.g., mammalian cell); (b) a retroviral nucleotide sequence in the
cell which comprises a coding sequence for lentivirus gagpol, wherein said coding
sequence has been codon optimized by mutagenesis to improve expression of the
lentivirus gagpol proteins; and (c) a retroviral nucleotide sequence which comprises a
DNA sequence of interest and lentivirus cis-acting sequences required for packaging,
reverse transcription and integration.

In a fifth embodiment of the invention, packaging cell lines for producing viral
accessory protein independent HIV-derived retroviral vector particles comprise (a) a cell
(e.g., mammalian cell); and (b) a retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins.

In a sixth embodiment of the invention, packaging cell lines for producing viral
accessory protein independent HIV-derived retroviral vector particles comprise (a) a cell
(e.g., mammalian cell); (b) a first retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; and
(c) a second retroviral nucleotide sequence in the cell which comprises the coding
sequence for a heterologous envelope protein.

In a seventh embodiment of the invention, packaging cell lines for producing
viral accessory protein independent HIV-derived retroviral vector particles comprise (a)
a cell (e.g., mammalian cell); (b) a first retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; (c)
a second retroviral nucleotide sequence in the cell which comprises the coding sequence
for a heterologous envelope protein; and (d) a third retroviral nucleotide sequence which
comprises a DNA sequence of interest and HIV cis-acting sequences required for
packaging, reverse transcription and integration.

In an eighth embodiment of the invention, packaging cell lines for producing
viral accessory protein independent HIV-derived retroviral vector particles comprise (a)
a cell (e.g., mammalian cell); (b) a retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; and
(c) a retroviral nucleotide sequence which comprises a DNA sequence of interest and
HIV cis-acting sequences required for packaging, reverse transcription and integration.

Alternatively, each of the packaging cell lines described herein can be produced
using (1) a retroviral nucleotide sequence which comprises a codon optimized gag
coding sequence and (2) a retroviral nucleotide sequence which comprises a codon
optimized pol coding sequence, in place of the retroviral nucleotide sequence which
comprises a codon optimized gagpol coding sequence.

In a particular embodiment, the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G). In another embodiment, the
heterologous envelope protein is the amphotropic envelope of the Moloney leukemia
virus (MLV).

Cell lines for producing viral accessory protein independent lentivirus-derived
retroviral vector particles are produced by transfecting host cells (e.g., mammalian host
cells) with a plasmid comprising a DNA sequence which encodes lentivirus gagpol
proteins, wherein said DNA sequence has been codon optimized by mutagenesis to
improve expression of the lentivirus gagpol proteins. Depending upon the particular
cell line being produced, the host cells are also co-transfected with a plasmid
comprising a DNA sequence which encodes a heterologous envelope protein, or a
plasmid comprising a DNA sequence of interest and lentivirus cis-acting sequences
required for packaging, reverse transcription and integration, or both of these plasmids.
Alternatively, host cells are transfected with a plasmid comprising a codon optimized
DNA sequence encoding a lentivirus gag protein and a plasmid comprising a codon
optimized DNA sequence encoding a lentivirus pol protein, in place of the plasmid
comprising a codon optimized DNA sequence encoding both lentivirus gagpol proteins.

Cell lines for producing viral accessory protein independent HIV-derived
retroviral vector particles are produced by co-transfecting host cells (e.g., mammalian
host cells) with a plasmid comprising a DNA sequence which encodes HIV gagpol
proteins, wherein said DNA sequence has been codon optimized by mutagenesis to
improve expression of the HIV gagpol proteins. Depending upon the particular cell line
being produced, the host cells are also co-transfected with a plasmid comprising a DNA
sequence which encodes a heterologous envelope protein, or a plasmid comprising a
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse
transcription and integration, or both of these plasmids. Alternatively, host cells are
transfected with a plasmid comprising a codon optimized DNA sequence encoding an
HIV gag protein and a plasmid comprising a codon optimized DNA sequence encoding
an HIV pol protein, in place of the plasmid comprising a codon optimized DNA sequence
encoding both HIV gagpol proteins.

The present invention also relates to methods of producing viral accessory
protein independent lentivirus-derived retroviral vector particles, comprising
co-transfecting host cells (e.g., mammalian host cells) with (a) a first plasmid comprising a
DNA sequence which encodes lentivirus gagpol proteins, wherein said DNA sequence
has been codon optimized by mutagenesis to improve expression of the lentivirus gagpol
proteins; (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of
interest and lentivirus cis-acting sequences required for packaging, reverse transcription
and integration. Alternatively, host cells are transfected with a plasmid comprising a
codon optimized DNA sequence encoding a lentivirus gag protein and a plasmid
comprising a codon optimized DNA sequence encoding a lentivirus pol protein, in place
of the first plasmid comprising a codon optimized DNA sequence encoding both
lentivirus gagpol proteins.

In a particular embodiment, the invention relates to methods of producing viral
accessory protein independent HIV-derived retroviral vector particles, comprising
co-transfecting host cells (e.g., mammalian host cells) with (a) a first plasmid comprising a
DNA sequence which encodes HIV gagpol proteins, wherein said DNA sequence has
been codon optimized by mutagenesis to improve expression of the HIV gagpol
proteins; (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of
interest and HIV cis-acting sequences required for packaging, reverse transcription and
integration. Alternatively, host cells are transfected with a plasmid comprising a codon
optimized DNA sequence encoding an HIV gag protein and a plasmid comprising a
codon optimized DNA sequence encoding an HIV pol protein, in place of the first
plasmid comprising a codon optimized DNA sequence encoding both HIV gagpol
proteins.

The present invention also relates to viral accessory protein independent
retroviral particles produced by or obtainable by (obtained by) the methods described
herein.

The present invention further relates to isolated DNA encoding a codon
optimized lentivirus gagpol, isolated DNA encoding the gag coding region of a codon
optimized lentivirus gagpol, and isolated DNA encoding the pol coding region of a
codon optimized lentivirus gagpol. In a particular embodiment, the present invention
relates to isolated DNA encoding a codon optimized HIV gagpol, isolated DNA
encoding the gag coding region of a codon optimized HIV gagpol, and isolated DNA
encoding the pol coding region of a codon optimized HIV gagpol.

The packaging cell lines and viral particles of the present invention can be used
for gene therapy or gene replacement with improved safety. The packaging cell lines
and viral particles of the present invention can also be used in development and
production of vaccines, and in production of biochemical reagents. Gene therapy
vectors produced with the cell lines of the present invention are expected to be valuable
medical therapeutics.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a schematic diagram of an expression cassette containing the codon
optimized gagpol genes. The DNA was constructed in multiple segments, which are
indicated at the top as 1/3, 2/3, 3/3 (A, B, C and D) and HIN. Restriction sites used to
assemble the cloned segments are indicated above the kilobasepair (Kb) ruler. Below
the ruler are multiple features showing the location of the human cytomegalovirus
(CMV) promoter, human betaglobin sequences (Bglobin), mRNA sequences (thinner
line represents intronic sequence), the gag and pol open reading frames, the individual
proteolytic fragment coding sequences (p17_MA, p24_CA, p7, p6, PR, p51_RT,
RNaseH and integrase (IN)) and each synthetic oligonucleotide used in the assembly
process (multiple adjacent open arrows).

Figure 2 is a table which depicts codon usage frequencies in genes which are
highly expressed and in the codon optimized gagpol open reading frame of the HIV
packaging construct described herein.

Figure 3 is a schematic representation of the HIV provirus and a three-plasmid
expression system used for generating a pseudotyped HIV-based vector by transient
transfection as described in Naldini et al., Science, 272:263-267 (1996).
Figure 4 is a list of some characteristics relating to the HIV Rev protein.

Figure 5 is a list of some points relating to codon optimization of HIV gagpol.

Figure 6 is a partial DNA sequence of HIV gag (SEQ ID NO:1), showing
inactivation of inhibitory sequences as described in Schwartz, S. et al., J. Virol.,
66(12):7176-7182 (1992).

Figure 7 is a plot of the %(G+C) content of wildtype HIV gagpol sequences and
theoretically codon optimized HIV gagpol sequences. The percent of bases, either G or
C, was calculated for a 30 nucleotide moving window for the entire length of the gagpol
gene, and the value plotted versus nucleotide position. Diamonds = HIV gagpol
sequences; squares = full optimal back-translation for gag open reading frame;
triangles = full optimal back-translation for pol open reading frame; CO = codon
optimized.
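
The moving-window %(G+C) calculation described for Figure 7 can be sketched as
follows. This is a minimal illustration only: the function name gc_window_percent,
the choice to report each window by the 1-based position of its first nucleotide, and
the short test sequence are assumptions made for this sketch and are not taken from
the figure or from the sequences of the invention.

```python
def gc_window_percent(seq, window=30):
    """Return (position, percent G+C) for every full window along seq.

    position is the 1-based index of the first nucleotide in the window
    (an assumed convention; the figure does not state which one it uses).
    """
    seq = seq.upper()
    points = []
    # Slide the window one nucleotide at a time over the whole sequence.
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        gc_count = sum(1 for base in chunk if base in "GC")
        points.append((start + 1, 100.0 * gc_count / window))
    return points


if __name__ == "__main__":
    # Hypothetical toy sequence, shown with a 4-nucleotide window for brevity.
    for position, percent in gc_window_percent("GGGGAAAA", window=4):
        print(position, percent)
```

Plotting the second element of each pair against the first, once for a wildtype
sequence and once for a codon optimized sequence, would give curves of the kind the
figure describes.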

Figures 8A-8E depict the alignment of the nucleotide sequences and predicted
amino acid sequences for the gag coding region of a wildtype HIV gagpol and a codon
optimized HIV gagpol. "NL4-3 genbank.SEQ" indicates the nucleotide sequence (SEQ
ID NO:2) and predicted amino acid sequence (SEQ ID NO:3) for the gag coding region
of a wildtype HIV gagpol. "pHDMHgpm2.seq" indicates the nucleotide sequence (SEQ
ID NO:4) and predicted amino acid sequence (SEQ ID NO:5) for the gag coding region
of a codon optimized HIV gagpol. The "NL4-3 genbank.SEQ" sequences are publicly
available at the NIH GenBank sequence repository (Accession No. M19921).

Figures 9A-9L depict the alignment of the nucleotide sequences and predicted
amino acid sequences for the pol coding region of a wildtype HIV gagpol and a codon
optimized HIV gagpol. "NL4-3 genbank.SEQ" indicates a nucleotide sequence (SEQ
ID NO:6) and a predicted amino acid sequence (SEQ ID NO:7) for the pol coding
region of a wildtype HIV gagpol available in the NIH GenBank sequence repository
(Accession No. M19921). The nucleotide and amino acid sequences for the pol coding
region available in the GenBank sequence repository contain two sequence errors,
which are indicated in Figures 9A-9L with shading. "pNL4-3.seq" indicates the correct
nucleotide sequence (SEQ ID NO:8) and predicted amino acid sequence (SEQ ID
NO:9) for the pol coding region of a wildtype HIV gagpol. "pHDMHgpm2.seq"
indicates the nucleotide sequence (SEQ ID NO:10) and predicted amino acid sequence
(SEQ ID NO:11) for the pol coding region of a codon optimized HIV gagpol.

Figures 10A-10D depict the DNA sequence (SEQ ID NO:12) for pHDMHgpm2.
The CMV enhancer/promoter is at nucleotides 97 to 679, human betaglobin sequences
(Bglobin) are at nucleotides 761 to 864, 865 to 1303 and 5710 to 6469 (end of Bglobin
is at nucleotides 6445 to 6469), mRNA sequences are at nucleotides 680 to 778 and 1255
to 5921, SV40 origin of replication is at nucleotides 8796 to 8908, beta-lactamase (bla)
coding region is at nucleotides 6709 to 7569, intron sequences are at nucleotides 779 to
1254, the codon optimized gag coding region is at nucleotides 1318 to 2820, the codon
optimized pol coding region is at nucleotides 2619 to 5624 and the poly A site is at
nucleotides 5897 to 5921.

Figure 11 is a circular map of plasmid pHDMHgpm2.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates to novel packaging cell lines useful for generating
viral accessory protein independent lentivirus-derived, particularly HIV-derived,
retroviral vector particles, to construction of such cell lines and to methods of using the
accessory protein independent lentivirus-derived retroviral vector particles to introduce
DNA of interest into cells (e.g., eukaryotic cells such as animal (particularly
mammalian), plant or yeast cells or prokaryotic cells such as bacterial cells). In a
particular embodiment, the packaging cell lines of the present invention are stable
packaging cell lines.

The cell lines are engineered to express the lentivirus proteins necessary for
virus particle formation (gagpol proteins), without containing DNA sequences from
lentivirus accessory proteins (tat, vif, vpr, vpu, nef and rev proteins and Rev response
element (RRE)). Additionally, no viral sequences (such as cis-acting elements termed
constitutive transport elements (CTEs)) will be expressed as RNA of any kind. DNA
sequences for lentivirus gagpol are codon optimized by extensively mutagenizing the
sequences to improve expression and to reduce the risk of recombination between
transfer vector sequences and gagpol messenger RNA. This greatly improves the safety
of virus preparations generated from these cell lines. In a particular embodiment, the
DNA sequences for lentivirus gagpol are not codon optimized in the overlap region
between the gag and pol sequences and in cis-acting signals necessary for translation of
pol.

Examples of lentiviruses include human immunodeficiency viruses (e.g., HIV-1,
HIV-2, HIV-3), bovine lentiviruses (e.g., bovine immunodeficiency viruses, bovine
immunodeficiency-like viruses, Jembrana disease viruses), equine lentiviruses (e.g.,
equine infectious anemia viruses), feline lentiviruses (e.g., feline immunodeficiency
viruses, panther lentiviruses, puma lentiviruses), ovine/caprine lentiviruses (e.g.,
Brazilian caprine lentiviruses, caprine arthritis-encephalitis viruses, Maedi-Visna
viruses, Maedi-Visna-like viruses, Maedi-Visna-related viruses, ovine lentiviruses,
Visna lentiviruses), Simian AIDS retroviruses (e.g., human T-cell lymphotropic virus
type 4), simian immunodeficiency viruses, simian-human immunodeficiency viruses,
human lymphotrophic viruses (e.g., type III), and simian T-cell lymphotrophic viruses.

In another embodiment, cell lines are engineered to express the HIV proteins
necessary for virus particle formation (gagpol proteins), without containing DNA
sequences from HIV accessory proteins (tat, vif, vpr, vpu, nef and rev proteins and Rev
response element (RRE)). Additionally, no viral sequences (such as cis-acting elements
termed constitutive transport elements (CTEs)) will be expressed as RNA of any kind.
DNA sequences for an HIV gagpol are codon optimized by mutagenesis to improve
expression and to reduce the risk of recombination between transfer vector sequences
and gagpol messenger RNA. In a particular embodiment, the DNA sequences for HIV
gagpol are not codon optimized in the overlap region between the gag and pol
sequences and in cis-acting signals necessary for translation of pol.

Alternatively, each of the packaging cell lines described herein can be produced
using (1) a nucleotide sequence which comprises a codon optimized gag coding
sequence and (2) a nucleotide sequence which comprises a codon optimized pol coding
sequence, in place of the nucleotide sequence which comprises a codon optimized
gagpol coding sequence. In this embodiment, the gag and pol coding sequences can be
completely codon optimized.

Benefits of the present invention include the removal of potentially harmful 
lentivirus accessory proteins and other viral sequences, and the reduction of the risk of 
recombination to produce replication competent virus. 

Packaging cell lines for producing viral accessory protein independent
lentivirus-derived retroviral vector particles comprise a mammalian cell and a retroviral
nucleotide sequence comprising a coding sequence for a lentivirus gagpol which has
been codon optimized. In a particular embodiment, the packaging cell lines further
comprise a retroviral nucleotide sequence comprising a coding sequence for a
heterologous envelope protein. In a second embodiment, the packaging cell lines
further comprise a retroviral nucleotide sequence comprising a coding sequence for a
heterologous envelope protein and a retroviral nucleotide sequence which comprises a
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse
transcription and integration. In a third embodiment, the packaging cell lines further
comprise a retroviral nucleotide sequence which comprises a DNA sequence of interest
and HIV cis-acting sequences required for packaging, reverse transcription and
integration. Alternatively, the packaging cell lines of the present invention comprise (1) a
retroviral nucleotide sequence which comprises a codon optimized gag coding sequence
and (2) a retroviral nucleotide sequence which comprises a codon optimized pol coding
sequence, in place of the retroviral nucleotide sequence which comprises a codon
optimized gagpol coding sequence.

The coding sequence(s) for lentivirus gagpol which has (have) been codon
optimized results in improved expression of the lentivirus gagpol proteins and reduces
the risk of recombination between the transfer vector and gagpol messenger RNA.
Codon optimization of the coding sequence(s) for lentivirus gagpol was obtained by
mutagenizing, for each particular amino acid residue, specific nucleic acid bases in a
codon for the particular amino acid residue to a nucleic acid base which is present in a
codon which occurs at a high frequency in genes which are highly expressed for the
same amino acid residue. In a particular embodiment, the resulting optimized codon
also does not cause introduction of mRNA splicing signals into the codon optimized
sequence. Thus, in a particular embodiment, codon optimization of the coding
sequence(s) for lentivirus gagpol is obtained by mutagenizing, for each particular amino
acid residue, specific nucleic acid bases in a codon for the particular amino acid residue
to a nucleic acid base that is present in a codon which (1) occurs at a high frequency in
genes which are highly expressed for the same amino acid residue and (2) does not
cause introduction of mRNA splicing signals into the codon optimized sequence.

Codon optimization typically results in the removal of nucleic acid base A-rich
instability elements.
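
The codon substitution described above can be illustrated with a short sketch. It is
offered under stated assumptions and is not the procedure actually used to build the
constructs of the invention: the PREFERRED table below is a hypothetical set of
codons often cited as frequent in highly expressed human genes and is not copied from
Figure 2; the function names and the toy input are invented for illustration; and the
second criterion (avoiding introduction of mRNA splicing signals) is not modelled at
all. The optional protected argument mimics leaving a region, such as a gag/pol
overlap, unoptimized.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical preferred codon per amino acid (illustrative, not Figure 2).
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def translate(seq):
    """Translate a DNA sequence in frame 0, ignoring trailing partial codons."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, usable, 3))


def optimize(seq, protected=()):
    """Replace each codon with the preferred codon for the same amino acid.

    protected is an iterable of 0-based, half-open (start, end) nucleotide
    intervals; any codon touching such an interval is left unchanged.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    out = []
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        if any(start < i + 3 and i < end for start, end in protected):
            out.append(codon)  # keep wildtype codon in protected regions
        else:
            out.append(PREFERRED[CODON_TO_AA[codon]])
    out.append(seq[usable:])  # carry over any trailing partial codon
    return "".join(out)


if __name__ == "__main__":
    wildtype = "ATGGGTGCGAGAGCGTCA"  # toy sequence, not an HIV sequence
    optimized = optimize(wildtype)
    print(optimized, translate(wildtype) == translate(optimized))
```

Because every replacement codon encodes the same amino acid as the codon it
replaces, the translated protein is unchanged while the nucleotide sequence diverges
from the wildtype.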

In a particular embodiment, the coding sequence for an HIV gagpol (pNL4-3;
available through the AIDS repository, NIH; Adachi et al., J. Virol., 59:284-291 (1986))
has been codon optimized to improve translational efficiency of the HIV gagpol
proteins and reduce the risk of recombination between the transfer vector and HIV
gagpol messenger RNA. Two hundred thirty-seven base pairs (237 bp) consisting of the
gag pol overlap and cis-acting signals necessary for translation of pol (nucleotides 2583
to 2819 of SEQ ID NO:12) were not optimized. The HIV gagpol sequence obtained
using the codon optimization process does not differ at the amino acid level from the
wildtype HIV gagpol sequence, but differs from it at the nucleotide level.
A codon optimized HIV gag sequence is shown in Figures 8A-8E
(pHDMHgpm2.seq) (SEQ ID NO:4). A codon optimized HIV pol sequence is shown in
Figures 9A-9L (pHDMHgpm2.seq) (SEQ ID NO:10).
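
As a quick consistency check on the coordinates quoted in this description, the
inclusive nucleotide ranges can be converted to lengths with a few lines of code. The
sketch below uses only positions stated in the text (the unoptimized region here, and
the gag and pol coding regions from the description of Figures 10A-10D); the
dictionary keys and helper names are assumptions made for illustration.

```python
# Inclusive, 1-based nucleotide positions in SEQ ID NO:12, as quoted in the text.
FEATURES = {
    "gag_cds": (1318, 2820),
    "pol_cds": (2619, 5624),
    "unoptimized_region": (2583, 2819),
}


def span_length(span):
    """Length in nucleotides of an inclusive (start, end) range."""
    start, end = span
    return end - start + 1


def overlap(a, b):
    """Inclusive overlap of two ranges, or None if they do not intersect."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


if __name__ == "__main__":
    for name, span in FEATURES.items():
        print(name, span, span_length(span))
    print("gag/pol overlap:", overlap(FEATURES["gag_cds"], FEATURES["pol_cds"]))
```

The unoptimized region evaluates to 237 nucleotides, which agrees with the 237 bp
stated above.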

A plasmid comprising DNA sequences which encode codon optimized lentivirus
gagpol proteins is also referred to herein as a packaging construct. This plasmid
includes a promoter which drives the expression of the gagpol proteins, such as the
human cytomegalovirus (hCMV) immediate early promoter. This plasmid is defective
for the production of the viral envelope and accessory proteins tat, vif, vpr, vpu, nef and
rev and the Rev response element (RRE). The packaging construct also does not
contain viral sequences which are transcribed into mRNA, such as constitutive transport
elements (CTEs).

A packaging construct comprising a codon optimized HIV gagpol is depicted in
Figure 1 and in Figure 11. Figures 10A-10D depict the DNA sequence (SEQ ID
NO:12) for the packaging construct pHDMHgpm2. This packaging construct
(pHDMHgpm2) was constructed as follows: Plasmid pMDA.HIVgp mam was
generated by chemical synthesis and PCR assembly (which is described in, for example,
Stemmer et al., Gene, 164:49-53 (1995)) of 215 different oligonucleotides. The DNA
sequence for pMDA.HIVgp mam is the same as the DNA sequence for
pMDA.HIVgp jtg except for 4.3 kb which was codon optimized using the DNAStar
program (LaserGene, Madison, WI). Two hundred thirty-seven base pairs (237 bp)
consisting of the gag pol overlap and cis-acting signals necessary for translation of pol
(nucleotides 2583 to 2819 of SEQ ID NO:12) were not optimized due to dual reading
frame constraints. An NsiI site 5' of IN was preserved to aid fusion with wildtype
sequences. Several single or double base pair silent mutations were introduced either to
prevent potential splice donors and acceptors, or by the synthesis process.
pMDA.HIVgp jtg was derived from HIV-1 strain NL4-3. The protease mutation that is
present in the NL4-3 NIH GenBank sequence was then repaired (Figure 9B), changing
the nucleotide present at position 2948 of SEQ ID NO:12 from a "G" to a "C", thereby
producing the codon present at nucleotide positions 2948 to 2950 of SEQ ID NO:12
which encodes an arginine instead of the glycine present in the NL4-3 GenBank amino
acid sequence. The resulting plasmid was named pMDHgpmam. The EcoRI-HindIII
fragment of pMDHgpmam was inserted into pHDM2b, a high copy version of the pMD
vector (Ory, D. et al., Proc. Natl. Acad. Sci. USA, 93(21):11400-11406 (1996)), to
produce plasmid pHDMHgpm. The sequencing mutation that is present in the RNase
domain of the NL4-3 NIH GenBank sequence was repaired (Figure 9H), changing the
codon present at nucleotide positions 4724 to 4726 of SEQ ID NO:12 from "GGG" to
"AAG", thereby producing a codon encoding a lysine instead of the glycine present in
the NL4-3 GenBank amino acid sequence. The resulting plasmid was named
pHDMHgpm2. Codon usage frequencies in the codon optimized gagpol open reading
frame of the packaging construct pHDMHgpm2 are shown in Figure 2.

As used herein, a heterologous envelope protein permits pseudotyping of 
particles generated by the packaging construct and includes the G glycoprotein of 
vesicular stomatitis virus (VSV G) and the amphotropic envelope of the Moloney 
leukemia virus (MLV). A plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein is also referred to herein as an envelope coding plasmid. 

The terms "mammal" and "mammalian", as used herein, refer to any vertebrate
animal, including monotremes, marsupials and placentals, that suckle their young and
either give birth to living young (eutherian or placental mammals) or are egg-laying
(metatherian or nonplacental mammals). Examples of mammalian species include
humans and other primates (e.g., monkeys, chimpanzees), rodents (e.g., rats, mice,
guinea pigs) and ruminants (e.g., cows, pigs, horses).

Examples of mammalian cells include human (such as HeLa cells, 293T cells,
NIH 3T3 cells), bovine, ovine, porcine, murine (such as embryonic stem cells), rabbit
and monkey (such as COS1 cells) cells. The cell may be a non-dividing cell (including
hepatocytes, myofibers, hematopoietic stem cells, neurons) or a dividing cell. The cell
may be an embryonic cell, bone marrow stem cell or other progenitor cell. Where the
cell is a somatic cell, the cell can be, for example, an epithelial cell, fibroblast, smooth
muscle cell, blood cell (including a hematopoietic cell, red blood cell, T-cell, B-cell,
etc.), tumor cell, cardiac muscle cell, macrophage, dendritic cell, neuronal cell (e.g., a
glial cell or astrocyte), or pathogen-infected cell (e.g., those infected by bacteria,
viruses, virusoids, parasites, or prions).

Typically, cells isolated from a specific tissue (such as epithelium, fibroblast or 
hematopoietic cells) are categorized as a "cell-type." The cells can be obtained 
commercially or from a depository or obtained directly from an animal, such as by 
biopsy. Alternatively, the cell need not be isolated at all from the animal where, for 
example, it is desirable to deliver the virus to the animal in gene therapy. 

To produce the cell lines of the present invention for producing viral accessory
protein independent lentivirus-derived retroviral vector particles, mammalian host cells
are co-transfected with (a) a first plasmid comprising a DNA sequence which encodes
lentivirus gagpol proteins, wherein said DNA sequence has been codon optimized by
mutagenesis, as described above, to improve expression of the lentivirus gagpol
proteins; and (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein, or a retroviral nucleotide sequence which comprises a
DNA sequence of interest and lentivirus cis-acting sequences required for packaging,
reverse transcription and integration, or both, under conditions appropriate for
transfection of the cells.

In a particular embodiment, to produce the cell lines of the present invention for
producing viral accessory protein independent HIV-derived retroviral vector particles,
mammalian host cells were co-transfected with (a) a first plasmid comprising a DNA
sequence which encodes HIV gagpol proteins, wherein said DNA sequence has been
codon optimized by mutagenesis, as described above, to improve expression of the HIV
gagpol proteins; and (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein, or a retroviral nucleotide sequence which comprises a
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse
transcription and integration, or both, under conditions appropriate for transfection of
the cells.

Virus stocks consisting of viral accessory protein independent lentivirus-derived,
particularly HIV-derived, retroviral vector particles of the present invention are
produced by maintaining the transfected cells under conditions suitable for virus
production (e.g., in an appropriate growth media and for an appropriate period of time).
Such conditions, which are not critical to the invention, are generally known in the art.
See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition,
Cold Spring Harbor University Press, New York (1989); Ausubel et al., Current
Protocols in Molecular Biology, John Wiley & Sons, New York (1998); U.S. Patent
No. 5,449,614; and U.S. Patent No. 5,460,959, the teachings of which are incorporated
herein by reference.

To generate viral accessory protein independent lentivirus-derived retroviral 
vector particles, mammalian host cells can be co-transfected with (a) a first plasmid 
comprising a DNA sequence which encodes lentivirus gagpol proteins, wherein said DNA 
sequence has been codon optimized by mutagenesis, as described above, to improve 
expression of the lentivirus gagpol proteins; (b) a second plasmid comprising a DNA 
sequence which encodes a heterologous envelope protein; and (c) a third plasmid 
comprising a DNA sequence of interest and lentivirus cis-acting sequences required for 
packaging, reverse transcription and integration. Alternatively, mammalian host cells are 
transfected with a plasmid comprising a codon optimized DNA sequence encoding a 
lentivirus gag protein and a plasmid comprising a codon optimized DNA sequence 
encoding a lentivirus pol protein, in place of the first plasmid comprising a codon 
optimized DNA sequence encoding both lentivirus gagpol proteins. 

In a particular embodiment, the invention relates to methods of producing viral 
accessory protein independent HIV-derived retroviral vector particles, comprising 
co-transfecting mammalian host cells with (a) a first plasmid comprising a DNA sequence 
which encodes HIV gagpol proteins, wherein said DNA sequence has been codon 
optimized by mutagenesis, as described above, to improve expression of the HIV gagpol 
proteins; (b) a second plasmid containing a DNA sequence which encodes a 
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of 
interest and HIV cis-acting sequences required for packaging, reverse transcription and 
integration. Alternatively, mammalian host cells are transfected with a plasmid 
comprising a codon optimized DNA sequence encoding an HIV gag protein and a 
plasmid comprising a codon optimized DNA sequence encoding an HIV pol protein, in 
place of the first plasmid comprising a codon optimized DNA sequence encoding both 
HIV gagpol proteins. 

Virus particles produced by the methods described herein, using a codon 
optimized HIV packaging construct produced as described herein, were compared by 
Western analysis with virus particles produced as described in Naldini et al., Science, 
272:263-267 (1996), using the packaging construct plasmid pCMVΔR8.2. Both the 
immunological reactivity and the proteolytic processing were confirmed to be 
indistinguishable. 

A plasmid comprising a DNA sequence of interest and HIV cis-acting sequences 
required for packaging, reverse transcription and integration is also referred to herein as 
a transfer vector. A transfer vector, as used herein, refers to a vehicle which is used to 
introduce a DNA of interest into a eukaryotic cell, particularly a mammalian cell. 
Figure 3 depicts an example of a transfer vector. 

A DNA sequence of interest, as used herein, includes all or a portion of a gene or 
genes encoding a nucleic acid product whose expression in a cell or a mammal is 
desired. In a particular embodiment, the nucleic acid product is a heterologous 
therapeutic protein. Examples of therapeutic proteins include antigens or immunogens, 
such as a polyvalent vaccine, cytokines, tumor necrosis factor, interferons, interleukins, 
adenosine deaminase, insulin, T-cell receptors, soluble CD4, growth factors (such as 
epidermal growth factor, human growth factor, insulin-like growth factors, fibroblast 
growth factors), blood factors, such as Factor VIII, Factor IX, cytochrome b, 
glucocerebrosidase, ApoE, ApoC, ApoAI, the LDL receptor, negative selection markers 
or "suicide proteins", such as thymidine kinase (including the HSV, CMV, VZV TK), 
anti-angiogenic factors, Fc receptors, plasminogen activators, such as t-PA, u-PA and 
streptokinase, dopamine, MHC, tumor suppressor genes such as p53 and Rb, 
monoclonal antibodies or antigen binding fragments thereof, drug resistance genes, ion 
channels, such as a calcium channel or a potassium channel, adrenergic receptors, 
hormones (including growth hormones) and anti-cancer agents. In another embodiment, 
the nucleic acid product is a gene product to be expressed in a cell or a mammal and 
which product is otherwise defective or absent in the cell or mammal. For example, the 
nucleic acid product can be the product of a functional gene or genes which are defective 
or absent in the cell or mammal. 

DNA sequence of interest includes DNA sequences (control sequences) which 
are necessary to drive the expression of the gene or genes. The control sequences are 
operably linked to the gene. The term "operably linked", as used herein, is defined to 
mean that the gene is linked to control sequences in a manner which allows expression 
of the gene (or the nucleic acid sequence). Generally, operably linked means 
contiguous. 

Control sequences include a transcriptional promoter, an optional operator 
sequence to control transcription, a sequence encoding suitable mRNA ribosomal 
binding sites and sequences which control termination of transcription and translation. 
In a particular embodiment, a recombinant gene encoding a desired nucleic acid product 
can be placed under the regulatory control of a promoter which can be induced or 
repressed, thereby offering a greater degree of control with respect to the level of the 
product produced. 

As used herein, the term "promoter" refers to a sequence of DNA, usually 
upstream (5') of the coding region of a structural gene, which controls the expression of 
the coding region by providing recognition and binding sites for RNA polymerase and 
other factors which may be required for initiation of transcription. Suitable promoters 
are well known in the art. Exemplary promoters include the SV40, CMV and human 
elongation factor (EF1) promoters. Other suitable promoters are readily available in the 
art (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, John Wiley & 
Sons, Inc., New York (1998); Sambrook et al., Molecular Cloning: A Laboratory 
Manual, 2nd edition, Cold Spring Harbor University Press, New York (1989); and U.S. 
Patent No. 5,681,735). 

A DNA sequence of interest can be isolated from nature, modified from native 
sequences or manufactured de novo, as described in, for example, Ausubel et al., 
Current Protocols in Molecular Biology, John Wiley & Sons, New York (1998); and 
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring 
Harbor University Press, New York (1989). DNA sequences can be isolated and fused 
together by methods known in the art, such as exploiting and manufacturing compatible 
cloning or restriction sites. 

The packaging cell lines and viral particles of the present invention can be used, 
in vitro, in vivo and ex vivo, to introduce DNA of interest into a eukaryotic cell (e.g., a 
mammalian cell) or a mammal (e.g., a human or other mammal or vertebrate). The cells 
can be obtained commercially or from a depository or obtained directly from a mammal, 
such as by biopsy. The cells can be obtained from a mammal to whom they will be 
returned or from another mammal of the same or a different species. For 
example, using the packaging cell lines or viral particles of the present invention, DNA 
of interest can be introduced into nonhuman cells, such as pig cells, which are then 
introduced into a human. Alternatively, the cell need not be isolated from the mammal 
where, for example, it is desirable to deliver viral particles of the present invention to the 
mammal in gene therapy. 

Ex vivo therapy has been described, for example, in Kasid et al., Proc. Natl. 
Acad. Sci. USA, 87:473 (1990); Rosenberg et al., N. Engl. J. Med., 323:570 (1990); 
Williams et al., Nature, 310:476 (1984); Dick et al., Cell, 42:71 (1985); Keller et al., 
Nature, 318:149 (1985); and Anderson et al., United States Patent No. 5,399,346. 

Methods for administering (introducing) viral particles directly to a mammal are 
generally known to those practiced in the art. For example, modes of administration 
include parenteral, injection, mucosal, systemic, implant, intraperitoneal, oral, 
intradermal, transdermal (e.g., in slow release polymers), intramuscular, intravenous 
including infusion and/or bolus injection, subcutaneous, topical, epidural, etc. Viral 
particles of the present invention can, preferably, be administered in a pharmaceutically 
acceptable carrier, such as saline, sterile water, Ringer's solution, and isotonic sodium 
chloride solution. 

The dosage of a viral particle of the present invention administered to a 
mammal, including frequency of administration, will vary depending upon a variety of 
factors, including mode and route of administration; size, age, sex, health, body weight 
and diet of the recipient mammal; nature and extent of symptoms of the disease or 
disorder being treated; kind of concurrent treatment, frequency of treatment, and the 
effect desired. 

The teachings of all the articles, patents, patent applications and GenBank 
sequences cited herein are incorporated by reference in their entirety. 

While this invention has been particularly shown and described with references 
to preferred embodiments thereof, it will be understood by those skilled in the art that 
various changes in form and details may be made therein without departing from the 
spirit and scope of the invention as defined by the appended claims. 

CLAIMS 

What is claimed is: 



1. A packaging cell line for producing a viral accessory protein independent 
HIV-derived retroviral vector particle comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for a HIV gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the HIV gagpol proteins; 

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein; and 

d) a third retroviral nucleotide sequence in the cell which comprises a DNA 
sequence of interest and HIV cis-acting sequences required for 
packaging, reverse transcription and integration. 

2. A packaging cell line of Claim 1 wherein the heterologous envelope protein is 
the G glycoprotein of vesicular stomatitis virus (VSV G). 

3. A packaging cell line of Claim 1 wherein the heterologous envelope protein is 
the amphotropic envelope of the Moloney leukemia virus. 

4. A packaging cell line of Claim 1 wherein the DNA sequence of interest encodes 
a heterologous therapeutic protein. 



5. A packaging cell line comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for a HIV gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the HIV gagpol proteins; and 

c) a second retroviral nucleotide sequence in the cell which comprises a 
DNA sequence of interest and HIV cis-acting sequences required for 
packaging, reverse transcription and integration. 

6. A packaging cell line of Claim 5 wherein the DNA sequence of interest encodes 
a heterologous therapeutic protein. 

7. A packaging cell line comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for a HIV gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the HIV gagpol proteins; and 

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein. 

8. A method of producing a packaging cell line for producing a viral accessory 
protein independent HIV-derived retroviral vector particle, comprising co- 
transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol 
proteins, wherein said DNA sequence has been mutagenized to improve 
expression of the HIV gag and pol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and HIV cis-acting 
sequences required for packaging, reverse transcription and 
integration. 

9. A method of Claim 8 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

10. A method of Claim 8 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

11. A method of Claim 8 wherein the DNA sequence of interest encodes a heterologous 
therapeutic protein. 

12. A method of producing a viral accessory protein independent HIV-derived 
retroviral vector particle comprising co-transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol 
proteins, wherein said DNA sequence has been mutagenized to improve 
expression of the HIV gagpol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and HIV cis-acting 
sequences required for packaging, reverse transcription and 
integration. 

13. A method of Claim 12 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

14. A method of Claim 12 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

15. A method of Claim 12 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

16. A packaging cell line for producing a viral accessory protein independent 
lentivirus-derived retroviral vector particle comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for a lentivirus gagpol, wherein said coding sequence 
has been mutagenized to improve expression of the lentivirus gagpol 
proteins; 

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein; and 

d) a third retroviral nucleotide sequence in the cell which comprises a DNA 
sequence of interest and lentivirus cis-acting sequences required for 
packaging, reverse transcription and integration. 

17. A packaging cell line of Claim 16 wherein the heterologous envelope protein is 
the G glycoprotein of vesicular stomatitis virus (VSV G). 

18. A packaging cell line of Claim 16 wherein the heterologous envelope protein is 
the amphotropic envelope of the Moloney leukemia virus. 

19. A packaging cell line of Claim 16 wherein the DNA sequence of interest 
encodes a heterologous therapeutic protein. 

20. A packaging cell line comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for lentivirus gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the lentivirus gagpol 
proteins; and 

c) a second retroviral nucleotide sequence in the cell which comprises a 
DNA sequence of interest and lentivirus cis-acting sequences required 
for packaging, reverse transcription and integration. 

21. A packaging cell line of Claim 20 wherein the DNA sequence of interest 
encodes a heterologous therapeutic protein. 

22. A packaging cell line comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for lentivirus gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the lentivirus gagpol 
proteins; and 

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein. 

23. A method of producing a packaging cell line for producing a viral accessory 
protein independent lentivirus-derived retroviral vector particle, comprising 
co-transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes lentivirus 
gagpol proteins, wherein said DNA sequence has been mutagenized to 
improve expression of the lentivirus gag and pol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and lentivirus 
cis-acting sequences required for packaging, reverse transcription and 
integration. 

24. A method of Claim 23 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

25. A method of Claim 23 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

26. A method of Claim 23 wherein the DNA sequence of interest encodes a heterologous 
therapeutic protein. 

27. A method of producing a viral accessory protein independent lentivirus-derived 
retroviral vector particle comprising co-transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes lentivirus 
gagpol proteins, wherein said DNA sequence has been mutagenized to 
improve expression of the lentivirus gagpol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and lentivirus 
cis-acting sequences required for packaging, reverse transcription and 
integration. 



28. A method of Claim 27 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

29. A method of Claim 27 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

30. A method of Claim 27 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

31. A viral accessory protein independent HIV-derived retroviral vector particle 
produced by the method comprising co-transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol 
proteins, wherein said DNA sequence has been mutagenized to improve 
expression of the HIV gagpol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and HIV cis-acting 
sequences required for packaging, reverse transcription and 
integration. 

32. A method of Claim 31 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

33. A method of Claim 31 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

34. A method of Claim 31 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

35. A viral accessory protein independent lentivirus-derived retroviral vector 
particle produced by the method comprising co-transfecting mammalian host 
cells with: 

a) a first plasmid comprising a DNA sequence which encodes lentivirus 
gagpol proteins, wherein said DNA sequence has been mutagenized to 
improve expression of the lentivirus gagpol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and lentivirus 
cis-acting sequences required for packaging, reverse transcription and 
integration. 

36. A method of Claim 35 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

37. A method of Claim 35 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

38. A method of Claim 35 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

39. Isolated DNA encoding a codon optimized HIV gagpol. 

40. Isolated DNA encoding a codon optimized HIV gag. 

41. Isolated DNA of Claim 40 comprising the nucleotide sequence of SEQ ID NO:4. 

42. Isolated DNA encoding a codon optimized HIV pol. 

43. Isolated DNA of Claim 42 comprising the nucleotide sequence of SEQ ID 
NO:10. 

44. A method of introducing a DNA sequence of interest into a mammal comprising 
introducing into said mammal a viral accessory protein independent HIV- 
derived retroviral vector particle comprising the DNA sequence of interest. 

45. The method of Claim 44 wherein the mammal is a human. 

46. The method of Claim 44 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

47. A method of introducing a DNA sequence of interest into a mammal comprising 
the steps of: 

a) introducing into cells a viral accessory protein independent HIV-derived 
retroviral vector particle comprising the DNA sequence of interest; and 

b) returning the cells obtained in step a) to the mammal. 

48. The method of Claim 47 wherein the mammal is a human. 



49. The method of Claim 47 wherein the DNA sequence of interest encodes a heterologous 
therapeutic protein. 
cag Gln(Q) 
gaa Glu(E) 
gag Glu(E) 

Rev 

• Regulates HIV gene expression by promoting 
cytoplasmic levels of unspliced and singly spliced 
mRNAs 

• Postulated to affect splicing, stability, transport, 
and translation 

Codon Optimization of HIV gagpol 

• Remove A-rich instability elements 

• Improve translational efficiency 

• Reduce risk of recombination with 
transfer vector 

Fig. 5 
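
The first two goals listed in Fig. 5 can be illustrated with a minimal sketch of synonymous codon replacement. The sketch is not the procedure used to build the constructs of the invention. The `PREFERRED` table is a hypothetical one-codon-per-amino-acid table chosen for illustration, and the function names are likewise assumptions. The test sequence is the first fifteen codons of wild-type NL4-3 gag as printed in the gag alignment figure of this document.

```python
"""Illustrative sketch of synonymous codon replacement (assumptions noted inline)."""

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built from the conventional T, C, A, G ordering.
GENETIC_CODE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical preferred codon per amino acid (an assumption for this sketch).
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "TCC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def codons(dna):
    """Split a coding sequence into codons, ignoring spaces and case."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    return "".join(GENETIC_CODE[c] for c in codons(dna))


def optimize(dna):
    """Replace each codon with the preferred synonymous codon.

    Stop codons have no entry in PREFERRED and are left unchanged.
    """
    return "".join(PREFERRED.get(GENETIC_CODE[c], c) for c in codons(dna))


def a_fraction(dna):
    """Fraction of adenine residues in the sequence."""
    seq = "".join(codons(dna))
    return seq.count("A") / len(seq)


# First 15 codons of wild-type NL4-3 gag, as shown in the gag alignment.
wild_type = "ATG GGT GCG AGA GCG TCG GTA TTA AGC GGG GGA GAA TTA GAT AAA"
optimized = optimize(wild_type)
```

Because every replacement is synonymous, `translate(optimized)` equals `translate(wild_type)`, while the adenine content of the coding strand drops. A real design would also have to weigh restriction sites, splice signals and regions that must stay unchanged, none of which this sketch attempts.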
Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table. 

792 MGARASVLSGGELDK NL4-3 genbank.SEQ 
792 ATG GGT GCG AGA GCG TCG GTA TTA AGC GGG GGA GAA TTA GAT AAA 
1319 MGARASVLSGGELDK pHDMHgpm2.seq 
1319 ATG GGC GCC CGC GCC TCC GTG CTG TCC GGC GGC GAG CTG GAC AAG 

837 WEKIRLRPGGKKQYK NL4-3 genbank.SEQ 
837 TGG GAA AAA ATT CGG TTA AGG CCA GGG GGA AAG AAA CAA TAT AAA 
1364 WEKIRLRPGGKKQYK pHDMHgpm2.seq 
1364 TGG GAG AAG ATC CGC CTG CGC CCC GGC GGC AAG AAG CAG TAC AAG 

882 LKHIVWASRELERFA NL4-3 genbank.SEQ 
882 CTA AAA CAT ATA GTA TGG GCA AGC AGG GAG CTA GAA CGA TTC GCA 
1409 LKHIVWASRELERFA pHDMHgpm2.seq 
1409 CTG AAG CAC ATC GTG TGG GCC TCC CGC GAG CTG GAG CGC TTC GCC 

927 VNPGLLETSEGCRQI NL4-3 genbank.SEQ 
927 GTT AAT CCT GGC CTT TTA GAG ACA TCA GAA GGC TGT AGA CAA ATA 
1454 VNPGLLETSEGCRQI pHDMHgpm2.seq 
1454 GTG AAC CCC GGC CTG CTG GAG ACC TCC GAG GGC TGC CGC CAG ATC 

972 LGQLQPSLQTGSEEL NL4-3 genbank.SEQ 
972 CTG GGA CAG CTA CAA CCA TCC CTT CAG ACA GGA TCA GAA GAA CTT 
1499 LGQLQPSLQTGSEEL pHDMHgpm2.seq 
1499 CTG GGC CAG CTG CAG CCC TCC CTG CAA ACC GGC TCC GAG GAG CTG 

1017 RSLYNTIAVLYCVHQ NL4-3 genbank.SEQ 
1017 AGA TCA TTA TAT AAT ACA ATA GCA GTC CTC TAT TGT GTG CAT CAA 
1544 RSLYNTIAVLYCVHQ pHDMHgpm2.seq 
1544 CGC TCC CTG TAC AAC ACC ATC GCC GTG CTG TAC TGC GTG CAC CAG 

1062 RIDVKDTKEALDKIE NL4-3 genbank.SEQ 
1062 AGG ATA GAT GTA AAA GAC ACC AAG GAA GCC TTA GAT AAG ATA GAG 
1589 RIDVKDTKEALDKIE pHDMHgpm2.seq 
1589 CGC ATC GAC GTG AAG GAC ACC AAG GAG GCC CTG GAC AAG ATC GAG 

1107 EEQNKSKKKAQQAAA NL4-3 genbank.SEQ 
1107 GAA GAG CAA AAC AAA AGT AAG AAA AAG GCA CAG CAA GCA GCA GCT 
1634 EEQNKSKKKAQQAAA pHDMHgpm2.seq 
1634 GAG GAG CAG AAC AAG TCC AAG AAG AAG GCC CAG CAG GCC GCC GCC 
Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table. 

1152 DTGNNSQVSQNYPIV NL4-3 genbank.SEQ 
1152 GAC ACA GGA AAC AAC AGC CAG GTC AGC CAA AAT TAC CCT ATA GTG 
1679 DTGNNSQVSQNYPIV pHDMHgpm2.seq 
1679 GAC ACC GGC AAC AAC TCC CAG GTG TCC CAG AAC TAC CCC ATC GTG 

1197 QNLQGQMVHQAISPR NL4-3 genbank.SEQ 
1197 CAG AAC CTC CAG GGG CAA ATG GTA CAT CAG GCC ATA TCA CCT AGA 
1724 QNLQGQMVHQAISPR pHDMHgpm2.seq 
1724 CAG AAC CTG CAG GGC CAG ATG GTG CAC CAG GCC ATC TCC CCC CGC 

1242 TLNAWVKVVEEKAFS NL4-3 genbank.SEQ 
1242 ACT TTA AAT GCA TGG GTA AAA GTA GTA GAA GAG AAG GCT TTC AGC 
1769 TLNAWVKVVEEKAFS pHDMHgpm2.seq 
1769 ACC CTG AAC GCC TGG GTG AAG GTG GTG GAG GAG AAG GCC TTC TCC 

1287 PEVIPMFSALSEGAT NL4-3 genbank.SEQ 
1287 CCA GAA GTA ATA CCC ATG TTT TCA GCA TTA TCA GAA GGA GCC ACC 
1814 PEVIPMFSALSEGAT pHDMHgpm2.seq 
1814 CCC GAA GTC ATC CCC ATG TTC TCC GCC CTG TCC GAG GGC GCC ACC 

1332 PQDLNTMLNTVGGHQ NL4-3 genbank.SEQ 
1332 CCA CAA GAT TTA AAT ACC ATG CTA AAC ACA GTG GGG GGA CAT CAA 
1859 PQDLNTMLNTVGGHQ pHDMHgpm2.seq 
1859 CCC CAG GAC CTG AAC ACC ATG CTG AAC ACC GTG GGC GGC CAC CAG 

1377 AAMQMLKETINEEAA NL4-3 genbank.SEQ 
1377 GCA GCC ATG CAA ATG TTA AAA GAG ACC ATC AAT GAG GAA GCT GCA 
1904 AAMQMLKETINEEAA pHDMHgpm2.seq 
1904 GCC GCC ATG CAG ATG CTG AAG GAG ACC ATC AAC GAG GAG GCC GCC 

1422 EWDRLHPVHAGPIAP NL4-3 genbank.SEQ 
1422 GAA TGG GAT AGA TTG CAT CCA GTG CAT GCA GGG CCT ATT GCA CCA 
1949 EWDRLHPVHAGPIAP pHDMHgpm2.seq 
1949 GAG TGG GAC CGC CTG CAC CCC GTG CAC GCC GGC CCC ATC GCC CCC 

1467 GQMREPRGSDIAGTT NL4-3 genbank.SEQ 
1467 GGC CAG ATG AGA GAA CCA AGG GGA AGT GAC ATA GCA GGA ACT ACT 
1994 GQMREPRGSDIAGTT pHDMHgpm2.seq 
1994 GGC CAG ATG CGC GAG CCC CGC GGC TCC GAC ATC GCC GGC ACC ACC 
Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table. 

1512 STLQEQIGWMTHNPP NL4-3 genbank.SEQ 
1512 AGT ACC CTT CAG GAA CAA ATA GGA TGG ATG ACA CAT AAT CCA CCT 
2039 STLQEQIGWMTHNPP pHDMHgpm2.seq 
2039 TCC ACC CTG CAA GAG CAG ATC GGC TGG ATG ACC CAC AAC CCC CCC 

1557 IPVGEIYKRWIILGL NL4-3 genbank.SEQ 
1557 ATC CCA GTA GGA GAA ATC TAT AAA AGA TGG ATA ATC CTG GGA TTA 
2084 IPVGEIYKRWIILGL pHDMHgpm2.seq 
2084 ATC CCC GTG GGC GAG ATC TAC AAG CGC TGG ATC ATC CTG GGC CTG 

1602 NKIVRMYSPTSILDI NL4-3 genbank.SEQ 
1602 AAT AAA ATA GTA AGA ATG TAT AGC CCT ACC AGC ATT CTG GAC ATA 
2129 NKIVRMYSPTSILDI pHDMHgpm2.seq 
2129 AAC AAG ATC GTG CGC ATG TAC TCC CCC ACC TCC ATC CTG GAC ATC 

1647 RQGPKEPFRDYVDRF NL4-3 genbank.SEQ 
1647 AGA CAA GGA CCA AAG GAA CCC TTT AGA GAC TAT GTA GAC CGA TTC 
2174 RQGPKEPFRDYVDRF pHDMHgpm2.seq 
2174 CGC CAG GGC CCC AAG GAG CCC TTC CGC GAC TAC GTG GAC CGC TTC 

1692 YKTLRAEQASQEVKN NL4-3 genbank.SEQ 
1692 TAT AAA ACT CTA AGA GCC GAG CAA GCT TCA CAA GAG GTA AAA AAT 
2219 YKTLRAEQASQEVKN pHDMHgpm2.seq 
2219 TAC AAG ACC CTG CGC GCC GAG CAG GCC TCC CAG GAG GTA AAG AAC 

1737 WMTETLLVQNANPDC NL4-3 genbank.SEQ 
1737 TGG ATG ACA GAA ACC TTG TTG GTC CAA AAT GCG AAC CCA GAT TGT 
2264 WMTETLLVQNANPDC pHDMHgpm2.seq 
2264 TGG ATG ACC GAG ACC CTG CTG GTG CAG AAC GCC AAC CCC GAC TGC 

1782 KTILKALGPGATLEE NL4-3 genbank.SEQ 
1782 AAG ACT ATT TTA AAA GCA TTG GGA CCA GGA GCG ACA CTA GAA GAA 
2309 KTILKALGPGATLEE pHDMHgpm2.seq 
2309 AAG ACC ATC CTG AAG GCC CTG GGC CCC GGC GCC ACC CTG GAG GAG 

1827 MMTACQGVGGPGHKA NL4-3 genbank.SEQ 
1827 ATG ATG ACA GCA TGT CAG GGA GTG GGG GGA CCC GGC CAT AAA GCA 
2354 MMTACQGVGGPGHKA pHDMHgpm2.seq 
2354 ATG ATG ACC GCC TGC CAG GGC GTG GGC GGC CCC GGC CAC AAG GCC 

Fig. 8C 
Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table. 

1872 RVLAEAMSQVTNPAT NL4-3 genbank.SEQ 
1872 AGA GTT TTG GCT GAA GCA ATG AGC CAA GTA ACA AAT CCA GCT ACC 
2399 RVLAEAMSQVTNPAT pHDMHgpm2.seq 
2399 CGC GTG CTG GCC GAG GCC ATG TCC CAA GTC ACC AAC CCC GCC ACC 

1917 IMIQKGNFRNQRKTV NL4-3 genbank.SEQ 
1917 ATA ATG ATA CAG AAA GGC AAT TTT AGG AAC CAA AGA AAG ACT GTT 
2444 IMIQKGNFRNQRKTV pHDMHgpm2.seq 
2444 ATC ATG ATC CAG AAG GGC AAC TTC CGC AAC CAG CGC AAG ACC GTG 

1962 KCFNCGKEGHIAKNC NL4-3 genbank.SEQ 
1962 AAG TGT TTC AAT TGT GGC AAA GAA GGG CAC ATA GCC AAA AAT TGC 
2489 KCFNCGKEGHIAKNC pHDMHgpm2.seq 
2489 AAG TGC TTC AAC TGC GGC AAG GAG GGC CAC ATC GCC AAG AAC TGC 

2007 RAPRKKGCWKCGKEG NL4-3 genbank.SEQ 
2007 AGG GCC CCT AGG AAA AAG GGC TGT TGG AAA TGT GGA AAG GAA GGA 
2534 RAPRKKGCWKCGKEG pHDMHgpm2.seq 
2534 CGC GCC CCC CGC AAG AAG GGC TGC TGG AAG TGC GGC AAG GAG GGC 

2052 HQMKDCTERQANFLG NL4-3 genbank.SEQ 
2052 CAC CAA ATG AAA GAT TGT ACT GAG AGA CAG GCT AAT TTT TTA GGG 
2579 HQMKDCTERQANFLG pHDMHgpm2.seq 
2579 CAC CAG ATG AAA GAT TGT ACT GAG AGA CAG GCT AAT TTT TTA GGG 

2097 KIWPSHKGRPGNFLQ NL4-3 genbank.SEQ 
2097 AAG ATC TGG CCT TCC CAC AAG GGA AGG CCA GGG AAT TTT CTT CAG 
2624 KIWPSHKGRPGNFLQ pHDMHgpm2.seq 
2624 AAG ATC TGG CCT TCC CAC AAG GGA AGG CCA GGG AAT TTT CTT CAG 

2142 SRPEPTAPPEESFRF NL4-3 genbank.SEQ 
2142 AGC AGA CCA GAG CCA ACA GCC CCA CCA GAA GAG AGC TTC AGG TTT 
2669 SRPEPTAPPEESFRF pHDMHgpm2.seq 
2669 AGC AGA CCA GAG CCA ACA GCC CCA CCA GAA GAG AGC TTC AGG TTT 

2187 GEETTTPSQKQEPID NL4-3 genbank.SEQ 
2187 GGG GAA GAG ACA ACA ACT CCC TCT CAG AAG CAG GAG CCG ATA GAC 
2714 GEETTTPSQKQEPID pHDMHgpm2.seq 
2714 GGG GAA GAG ACA ACA ACT CCC TCT CAG AAG CAG GAG CCG ATA GAC 

Fig. 8D 
Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table. 

2232 KELYPLASLRSLFGS NL4-3 genbank.SEQ 
2232 AAG GAA CTG TAT CCT TTA GCT TCC CTC AGA TCA CTC TTT GGC AGC 
2759 KELYPLASLRSLFGS pHDMHgpm2.seq 
2759 AAG GAA CTG TAT CCT TTA GCT TCC CTC AGA TCA CTC TTT GGC AGC 

2277 DPSSQ. NL4-3 genbank.SEQ 
2277 GAC CCC TCG TCA CAA TAA 
2804 DPSSQ. pHDMHgpm2.seq 
2804 GAC CCC TCG TCA CAA TAA 
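
One way to get a rough feel for how far codon optimization moves a coding sequence away from its wild-type counterpart is to compare the two at the nucleotide level. The sketch below computes per-position identity and the longest uninterrupted identical stretch for two aligned, equal-length sequences. Treating these numbers as proxies for recombination potential is an assumption of this sketch, not a measure defined in this document, and the function names are hypothetical. The two input strings are the first block of the gag alignment (wild-type NL4-3 and pHDMHgpm2).

```python
def clean(dna):
    """Normalize a sequence: drop spaces, use upper case."""
    return dna.replace(" ", "").upper()


def match_flags(a, b):
    """Per-position True/False flags for two aligned, equal-length sequences."""
    a, b = clean(a), clean(b)
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [x == y for x, y in zip(a, b)]


def identity(a, b):
    """Fraction of positions at which the two sequences carry the same base."""
    flags = match_flags(a, b)
    return sum(flags) / len(flags)


def longest_identical_run(a, b):
    """Length of the longest uninterrupted stretch of identical bases."""
    best = run = 0
    for same in match_flags(a, b):
        run = run + 1 if same else 0
        best = max(best, run)
    return best


# First alignment block of gag: wild-type NL4-3 versus pHDMHgpm2.
wild_type = "ATG GGT GCG AGA GCG TCG GTA TTA AGC GGG GGA GAA TTA GAT AAA"
optimized = "ATG GGC GCC CGC GCC TCC GTG CTG TCC GGC GGC GAG CTG GAC AAG"

print(f"identity: {identity(wild_type, optimized):.2f}")
print(f"longest identical run: {longest_identical_run(wild_type, optimized)}")
```

For this block, 27 of the 45 positions match and the longest identical stretch is five bases, even though both sequences encode the same fifteen amino acids.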
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Alignnwnl Report of Codon Optimization (poJ).MEG. using Ousted method with PAM250 residue weight table. 

2090

2087 FFREDLAFPQGKARE NL4-3 genbank.SEQ

2087 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA

2085 FFREDLAFPQGKARE pNL4-3.seq

2085 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA

2612 FFREDLAFPQGKARE pHDMHgpm2.seq

2612 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA 

2150

2132 FSSEQTRANSPTRRE NL4-3 genbank.SEQ

2132 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG

2130 FSSEQTRANSPTRRE pNL4-3.seq

2130 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG

2657 FSSEQTRANSPTRRE pHDMHgpm2.seq

2657 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG

2210 



2177 LQVWGRDNNSLSEAG NL4-3 genbank.SEQ

2177 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA

2175 LQVWGRDNNSLSEAG pNL4-3.seq

2175 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA

2702 LQVWGRDNNSLSEAG pHDMHgpm2.seq

2702 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA 



2240 

2222 ADRQGTVSFSFPQIT NL4-3 genbank.SEQ
2222 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT
2220 ADRQGTVSFSFPQIT pNL4-3.seq
2220 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT
2747 ADRQGTVSFSFPQIT pHDMHgpm2.seq
2747 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT 



2270 2300

2267 LWQRPLVTIKIGGQL NL4-3 genbank.SEQ
2267 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA
2265 LWQRPLVTIKIGGQL pNL4-3.seq
2265 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA
2792 LWQRPLVTIKIGGQL pHDMHgpm2.seq
2792 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATC GGT GGC CAG CTG

2330

2312 KEALLDTGADDTVLE NL4-3 genbank.SEQ
2312 AAG GAA GCT CTA TTA GAT ACA GGA GCA GAT GAT ACA GTA TTA GAA
2310 KEALLDTGADDTVLE pNL4-3.seq
2310 AAG GAA GCT CTA TTA GAT ACA GGA GCA GAT GAT ACA GTA TTA GAA
2837 KEALLDTGADDTVLE pHDMHgpm2.seq
2837 AAG GAG GCC CTG CTG GAC ACC GGC GCC GAC GAC ACC GTG CTG GAG

Fig. 9A 
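
In the first blocks of Fig. 9A the pHDMHgpm2 row is identical to the wild-type rows, and the rows first differ inside the block that starts at 2265 (pNL4-3) and 2792 (pHDMHgpm2). The following is a minimal sketch for locating that point in a pair of aligned codon rows. It is offered as an illustration only; the helper name `first_divergence` is an assumption and does not come from the source.

```python
from typing import Optional


def first_divergence(row_a: str, row_b: str) -> Optional[int]:
    """Return the 0-based index of the first codon that differs, or None.

    Both rows are whitespace-separated codons in the same reading frame.
    """
    for i, (a, b) in enumerate(zip(row_a.split(), row_b.split())):
        if a != b:
            return i
    return None


# Rows copied from the alignment block at 2265 (pNL4-3) and 2792 (pHDMHgpm2).
PNL43 = "CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA"
PHDMH = "CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATC GGT GGC CAG CTG"

if __name__ == "__main__":
    idx = first_divergence(PNL43, PHDMH)
    print(idx)  # index of the first substituted codon in this block
    if idx is not None:
        print(PNL43.split()[idx], "->", PHDMH.split()[idx])
```

A return value of `None` means the two rows are identical over the compared length, as is the case for the preceding blocks of this figure.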



2360 2390

2357 EMNLPGRWKPKMIGG NL4-3 genbank.SEQ
2357 GAA ATG AAT TTG CCA GGA AGA TGG AAA CCA AAA ATG ATA GGG GGA
2355 EMNLPGRWKPKMIGG pNL4-3.seq
2355 GAA ATG AAT TTG CCA GGA AGA TGG AAA CCA AAA ATG ATA GGG GGA
2882 EMNLPGRWKPKMIGG pHDMHgpm2.seq
2882 GAG ATG AAC CTG CCC GGC CGC TGG AAG CCC AAG ATG ATC GGC GGC

2420

2402 IGGFIKVGQYDQILI NL4-3 genbank.SEQ
2402 ATT GGA GGT TTT ATC AAA GTA GGA CAG TAT GAT CAG ATA CTC ATA
2400 IGGFIKVRQYDQILI pNL4-3.seq
2400 ATT GGA GGT TTT ATC AAA GTA AGA CAG TAT GAT CAG ATA CTC ATA
2927 IGGFIKVRQYDQILI pHDMHgpm2.seq
2927 ATC GGC GGC TTC ATC AAA GTC CGC CAG TAC GAC CAG ATC CTG ATC

2450 2480

2447 EICGHKAIGTVLVGP NL4-3 genbank.SEQ
2447 GAA ATC TGC GGA CAT AAA GCT ATA GGT ACA GTA TTA GTA GGA CCT
2445 EICGHKAIGTVLVGP pNL4-3.seq
2445 GAA ATC TGC GGA CAT AAA GCT ATA GGT ACA GTA TTA GTA GGA CCT
2972 EICGHKAIGTVLVGP pHDMHgpm2.seq
2972 GAG ATC TGC GGC CAC AAG GCC ATC GGC ACC GTG CTG GTG GGC CCC

2510

2492 TPVNIIGRNLLTQIG NL4-3 genbank.SEQ
2492 ACA CCT GTC AAC ATA ATT GGA AGA AAT CTG TTG ACT CAG ATT GGC
2490 TPVNIIGRNLLTQIG pNL4-3.seq
2490 ACA CCT GTC AAC ATA ATT GGA AGA AAT CTG TTG ACT CAG ATT GGC
3017 TPVNIIGRNLLTQIG pHDMHgpm2.seq
3017 ACC CCC GTG AAC ATC ATC GGC CGC AAC CTG CTG ACC CAG ATC GGC

2540 2570

2537 CTLNFPISPIETVPV NL4-3 genbank.SEQ
2537 TGC ACT TTA AAT TTT CCC ATT AGT CCT ATT GAG ACT GTA CCA GTA
2535 CTLNFPISPIETVPV pNL4-3.seq
2535 TGC ACT TTA AAT TTT CCC ATT AGT CCT ATT GAG ACT GTA CCA GTA
3062 CTLNFPISPIETVPV pHDMHgpm2.seq
3062 TGC ACC CTG AAC TTC CCC ATC TCC CCC ATC GAG ACC GTG CCC GTG

2600

2582 KLKPGMDGPKVKQWP NL4-3 genbank.SEQ
2582 AAA TTA AAG CCA GGA ATG GAT GGC CCA AAA GTT AAA CAA TGG CCA
2580 KLKPGMDGPKVKQWP pNL4-3.seq
2580 AAA TTA AAG CCA GGA ATG GAT GGC CCA AAA GTT AAA CAA TGG CCA
3107 KLKPGMDGPKVKQWP pHDMHgpm2.seq
3107 AAG CTG AAG CCC GGC ATG GAC GGC CCC AAA GTC AAG CAG TGG CCC



2630 2660 

2627 LTEEKIKALVEICTE NL4-3 genbank.SEQ
2627 TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA
2625 LTEEKIKALVEICTE pNL4-3.seq
2625 TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA
3152 LTEEKIKALVEICTE pHDMHgpm2.seq
3152 CTG ACC GAG GAG AAG ATC AAG GCC CTG GTG GAG ATC TGC ACC GAG

2690 

2672 MEKEGKISKIGPENP NL4-3 genbank.SEQ
2672 ATG GAA AAG GAA GGA AAA ATT TCA AAA ATT GGG CCT GAA AAT CCA
2670 MEKEGKISKIGPENP pNL4-3.seq
2670 ATG GAA AAG GAA GGA AAA ATT TCA AAA ATT GGG CCT GAA AAT CCA
3197 MEKEGKISKIGPENP pHDMHgpm2.seq
3197 ATG GAG AAG GAG GGC AAG ATC TCC AAG ATC GGC CCC GAG AAC CCC

2720 2750

2717 YNTPVFAIKKKDSTK NL4-3 genbank.SEQ
2717 TAC AAT ACT CCA GTA TTT GCC ATA AAG AAA AAA GAC AGT ACT AAA
2715 YNTPVFAIKKKDSTK pNL4-3.seq
2715 TAC AAT ACT CCA GTA TTT GCC ATA AAG AAA AAA GAC AGT ACT AAA
3242 YNTPVFAIKKKDSTK pHDMHgpm2.seq
3242 TAC AAC ACC CCC GTG TTC GCC ATC AAG AAG AAG GAC TCC ACC AAG

2780

2762 WRKLVDFRELNKRTQ NL4-3 genbank.SEQ
2762 TGG AGA AAA TTA GTA GAT TTC AGA GAA CTT AAT AAG AGA ACT CAA
2760 WRKLVDFRELNKRTQ pNL4-3.seq
2760 TGG AGA AAA TTA GTA GAT TTC AGA GAA CTT AAT AAG AGA ACT CAA
3287 WRKLVDFRELNKRTQ pHDMHgpm2.seq
3287 TGG CGC AAG CTG GTG GAC TTC CGC GAG CTG AAC AAG CGC ACC CAG


2810 2840

2807 DFWEVQLGIPHPAGL NL4-3 genbank.SEQ
2807 GAT TTC TGG GAA GTT CAA TTA GGA ATA CCA CAT CCT GCA GGG TTA
2805 DFWEVQLGIPHPAGL pNL4-3.seq
2805 GAT TTC TGG GAA GTT CAA TTA GGA ATA CCA CAT CCT GCA GGG TTA
3332 DFWEVQLGIPHPAGL pHDMHgpm2.seq
3332 GAC TTC TGG GAG GTG CAG CTG GGC ATC CCC CAC CCC GCC GGC CTG

2852 KQKKSVTVLDVGDAY NL4-3 genbank.SEQ
2852 AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT
2850 KQKKSVTVLDVGDAY pNL4-3.seq
2850 AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT
3377 KQKKSVTVLDVGDAY pHDMHgpm2.seq
3377 AAG CAG AAG AAG TCC GTG ACC GTG CTG GAC GTG GGC GAC GCC TAC

2900 2930

2897 FSVPLDKDFRKYTAF NL4-3 genbank.SEQ
2897 TTT TCA GTT CCC TTA GAT AAA GAC TTC AGG AAG TAT ACT GCA TTT
2895 FSVPLDKDFRKYTAF pNL4-3.seq
2895 TTT TCA GTT CCC TTA GAT AAA GAC TTC AGG AAG TAT ACT GCA TTT
3422 FSVPLDKDFRKYTAF pHDMHgpm2.seq
3422 TTC TCC GTG CCC CTG GAC AAG GAC TTC CGC AAG TAC ACC GCC TTC

2960

2942 TIPSINNETPGIRYQ NL4-3 genbank.SEQ
2942 ACC ATA CCT AGT ATA AAC AAT GAG ACA CCA GGG ATT AGA TAT CAG
2940 TIPSINNETPGIRYQ pNL4-3.seq
2940 ACC ATA CCT AGT ATA AAC AAT GAG ACA CCA GGG ATT AGA TAT CAG
3467 TIPSINNETPGIRYQ pHDMHgpm2.seq
3467 ACC ATC CCC TCC ATC AAC AAC GAG ACC CCC GGC ATC CGC TAC CAG

2990 3020

2987 YNVLPQGWKGSPAIF NL4-3 genbank.SEQ
2987 TAC AAT GTG CTT CCA CAG GGA TGG AAA GGA TCA CCA GCA ATA TTC
2985 YNVLPQGWKGSPAIF pNL4-3.seq
2985 TAC AAT GTG CTT CCA CAG GGA TGG AAA GGA TCA CCA GCA ATA TTC
3512 YNVLPQGWKGSPAIF pHDMHgpm2.seq
3512 TAC AAC GTG CTG CCC CAG GGC TGG AAG GGC TCC CCC GCC ATC TTC

3050

3032 QCSMTKILEPFRKQN NL4-3 genbank.SEQ
3032 CAG TGT AGC ATG ACA AAA ATC TTA GAG CCT TTT AGA AAA CAA AAT
3030 QCSMTKILEPFRKQN pNL4-3.seq
3030 CAG TGT AGC ATG ACA AAA ATC TTA GAG CCT TTT AGA AAA CAA AAT
3557 QCSMTKILEPFRKQN pHDMHgpm2.seq
3557 CAG TGC TCC ATG ACC AAG ATC CTG GAG CCC TTC CGC AAG CAG AAC

3080 3110

3077 PDIVIYQYMDDLYVG NL4-3 genbank.SEQ
3077 CCA GAC ATA GTC ATC TAT CAA TAC ATG GAT GAT TTG TAT GTA GGA
3075 PDIVIYQYMDDLYVG pNL4-3.seq
3075 CCA GAC ATA GTC ATC TAT CAA TAC ATG GAT GAT TTG TAT GTA GGA
3602 PDIVIYQYMDDLYVG pHDMHgpm2.seq
3602 CCC GAC ATC GTG ATC TAC CAG TAC ATG GAC GAC CTG TAC GTG GGC

3140

3122 SDLEIGQHRTKIEEL NL4-3 genbank.SEQ
3122 TCT GAC TTA GAA ATA GGG CAG CAT AGA ACA AAA ATA GAG GAA CTG
3120 SDLEIGQHRTKIEEL pNL4-3.seq
3120 TCT GAC TTA GAA ATA GGG CAG CAT AGA ACA AAA ATA GAG GAA CTG
3647 SDLEIGQHRTKIEEL pHDMHgpm2.seq
3647 TCC GAC CTG GAG ATC GGC CAG CAC CGC ACC AAG ATC GAG GAG CTG



3170 3200

3167 RQHLLRWGFTTPDKK NL4-3 genbank.SEQ
3167 AGA CAA CAT CTG TTG AGG TGG GGA TTT ACC ACA CCA GAC AAA AAA
3165 RQHLLRWGFTTPDKK pNL4-3.seq
3165 AGA CAA CAT CTG TTG AGG TGG GGA TTT ACC ACA CCA GAC AAA AAA
3692 RQHLLRWGFTTPDKK pHDMHgpm2.seq
3692 CGC CAG CAC CTG CTG CGC TGG GGC TTC ACC ACC CCC GAC AAG AAG

3230

3212 HQKEPPFLWMGYELH NL4-3 genbank.SEQ
3212 CAT CAG AAA GAA CCT CCA TTC CTT TGG ATG GGT TAT GAA CTC CAT
3210 HQKEPPFLWMGYELH pNL4-3.seq
3210 CAT CAG AAA GAA CCT CCA TTC CTT TGG ATG GGT TAT GAA CTC CAT
3737 HQKEPPFLWMGYELH pHDMHgpm2.seq
3737 CAC CAG AAG GAG CCC CCC TTC CTG TGG ATG GGC TAC GAG CTG CAC

3260 3290

3257 PDKWTVQPIVLPEKD NL4-3 genbank.SEQ
3257 CCT GAT AAA TGG ACA GTA CAG CCT ATA GTG CTG CCA GAA AAG GAC
3255 PDKWTVQPIVLPEKD pNL4-3.seq
3255 CCT GAT AAA TGG ACA GTA CAG CCT ATA GTG CTG CCA GAA AAG GAC
3782 PDKWTVQPIVLPEKD pHDMHgpm2.seq
3782 CCC GAC AAG TGG ACC GTG CAG CCC ATC GTG CTG CCC GAG AAG GAC

3320

3302 SWTVNDIQKLVGKLN NL4-3 genbank.SEQ
3302 AGC TGG ACT GTC AAT GAC ATA CAG AAA TTA GTG GGA AAA TTG AAT
3300 SWTVNDIQKLVGKLN pNL4-3.seq
3300 AGC TGG ACT GTC AAT GAC ATA CAG AAA TTA GTG GGA AAA TTG AAT
3827 SWTVNDIQKLVGKLN pHDMHgpm2.seq
3827 TCC TGG ACC GTG AAC GAC ATC CAG AAG CTG GTG GGC AAG CTG AAC

3350 3380

3347 WASQIYAGIKVRQLC NL4-3 genbank.SEQ
3347 TGG GCA AGT CAG ATT TAT GCA GGG ATT AAA GTA AGG CAA TTA TGT
3345 WASQIYAGIKVRQLC pNL4-3.seq
3345 TGG GCA AGT CAG ATT TAT GCA GGG ATT AAA GTA AGG CAA TTA TGT
3872 WASQIYAGIKVRQLC pHDMHgpm2.seq
3872 TGG GCC TCC CAG ATC TAC GCC GGC ATC AAA GTC CGC CAG CTG TGC

3410

3392 KLLRGTKALTEVVPL NL4-3 genbank.SEQ
3392 AAA CTT CTT AGG GGA ACC AAA GCA CTA ACA GAA GTA GTA CCA CTA
3390 KLLRGTKALTEVVPL pNL4-3.seq
3390 AAA CTT CTT AGG GGA ACC AAA GCA CTA ACA GAA GTA GTA CCA CTA
3917 KLLRGTKALTEVVPL pHDMHgpm2.seq
3917 AAG CTG CTG CGC GGC ACC AAG GCC CTG ACC GAG GTG GTG CCC CTG



3440 3470

3437 TEEAELELAENREIL NL4-3 genbank.SEQ
3437 ACA GAA GAA GCA GAG CTA GAA CTG GCA GAA AAC AGG GAG ATT CTA
3435 TEEAELELAENREIL pNL4-3.seq
3435 ACA GAA GAA GCA GAG CTA GAA CTG GCA GAA AAC AGG GAG ATT CTA
3962 TEEAELELAENREIL pHDMHgpm2.seq
3962 ACC GAG GAG GCC GAG CTG GAG CTG GCC GAG AAC CGC GAG ATC CTG

3500

3482 KEPVHGVYYDPSKDL NL4-3 genbank.SEQ
3482 AAA GAA CCG GTA CAT GGA GTG TAT TAT GAC CCA TCA AAA GAC TTA
3480 KEPVHGVYYDPSKDL pNL4-3.seq
3480 AAA GAA CCG GTA CAT GGA GTG TAT TAT GAC CCA TCA AAA GAC TTA
4007 KEPVHGVYYDPSKDL pHDMHgpm2.seq
4007 AAG GAG CCC GTG CAC GGC GTG TAC TAC GAC CCC TCC AAG GAC CTG

3530 3560

3527 IAEIQKQGQGQWTYQ NL4-3 genbank.SEQ
3527 ATA GCA GAA ATA CAG AAG CAG GGG CAA GGC CAA TGG ACA TAT CAA
3525 IAEIQKQGQGQWTYQ pNL4-3.seq
3525 ATA GCA GAA ATA CAG AAG CAG GGG CAA GGC CAA TGG ACA TAT CAA
4052 IAEIQKQGQGQWTYQ pHDMHgpm2.seq
4052 ATC GCC GAG ATC CAG AAG CAG GGC CAG GGC CAG TGG ACC TAC CAG

3590

3572 IYQEPFKNLKTGKYA NL4-3 genbank.SEQ
3572 ATT TAT CAA GAG CCA TTT AAA AAT CTG AAA ACA GGA AAA TAT GCA
3570 IYQEPFKNLKTGKYA pNL4-3.seq
3570 ATT TAT CAA GAG CCA TTT AAA AAT CTG AAA ACA GGA AAA TAT GCA
4097 IYQEPFKNLKTGKYA pHDMHgpm2.seq
4097 ATC TAC CAG GAG CCC TTC AAG AAC CTG AAG ACC GGC AAA TAC GCC

3620 3650

3617 RMKGAHTNDVKQLTE NL4-3 genbank.SEQ
3617 AGA ATG AAG GGT GCC CAC ACT AAT GAT GTG AAA CAA TTA ACA GAG
3615 RMKGAHTNDVKQLTE pNL4-3.seq
3615 AGA ATG AAG GGT GCC CAC ACT AAT GAT GTG AAA CAA TTA ACA GAG
4142 RMKGAHTNDVKQLTE pHDMHgpm2.seq
4142 CGC ATG AAG GGC GCC CAC ACC AAC GAC GTG AAG CAG CTG ACC GAG

3680

3662 AVQKIATESIVIWGK NL4-3 genbank.SEQ
3662 GCA GTA CAA AAA ATA GCC ACA GAA AGC ATA GTA ATA TGG GGA AAG
3660 AVQKIATESIVIWGK pNL4-3.seq
3660 GCA GTA CAA AAA ATA GCC ACA GAA AGC ATA GTA ATA TGG GGA AAG
4187 AVQKIATESIVIWGK pHDMHgpm2.seq
4187 GCC GTG CAG AAG ATC GCC ACC GAG TCC ATC GTG ATC TGG GGC AAG

3710 3740

3707 TPKFKLPIQKETWEA NL4-3 genbank.SEQ
3707 ACT CCT AAA TTT AAA TTA CCC ATA CAA AAG GAA ACA TGG GAA GCA
3705 TPKFKLPIQKETWEA pNL4-3.seq
3705 ACT CCT AAA TTT AAA TTA CCC ATA CAA AAG GAA ACA TGG GAA GCA
4232 TPKFKLPIQKETWEA pHDMHgpm2.seq
4232 ACT CCC AAG TTC AAG CTG CCC ATC CAG AAG GAG ACC TGG GAG GCC

3770

3752 WWTEYWQATWIPEWE NL4-3 genbank.SEQ
3752 TGG TGG ACA GAG TAT TGG CAA GCC ACC TGG ATT CCT GAG TGG GAG
3750 WWTEYWQATWIPEWE pNL4-3.seq
3750 TGG TGG ACA GAG TAT TGG CAA GCC ACC TGG ATT CCT GAG TGG GAG
4277 WWTEYWQATWIPEWE pHDMHgpm2.seq
4277 TGG TGG ACC GAG TAC TGG CAG GCC ACC TGG ATC CCC GAG TGG GAG

3800 3830

3797 FVNTPPLVKLWYQLE NL4-3 genbank.SEQ
3797 TTT GTC AAT ACC CCT CCC TTA GTG AAG TTA TGG TAC CAG TTA GAG
3795 FVNTPPLVKLWYQLE pNL4-3.seq
3795 TTT GTC AAT ACC CCT CCC TTA GTG AAG TTA TGG TAC CAG TTA GAG
4322 FVNTPPLVKLWYQLE pHDMHgpm2.seq
4322 TTC GTG AAC ACC CCC CCC CTG GTG AAG CTG TGG TAC CAG CTG GAG

3860

3842 KEPIIGAETFYVDGA NL4-3 genbank.SEQ
3842 AAA GAA CCC ATA ATA GGA GCA GAA ACT TTC TAT GTA GAT GGG GCA
3840 KEPIIGAETFYVDGA pNL4-3.seq
3840 AAA GAA CCC ATA ATA GGA GCA GAA ACT TTC TAT GTA GAT GGG GCA
4367 KEPIIGAETFYVDGA pHDMHgpm2.seq
4367 AAG GAG CCC ATC ATC GGC GCC GAG ACC TTC TAC GTG GAC GGC GCC

3890 3920

3887 ANRETKLGKAGYVTD NL4-3 genbank.SEQ
3887 GCC AAT AGG GAA ACT AAA TTA GGA AAA GCA GGA TAT GTA ACT GAC
3885 ANRETKLGKAGYVTD pNL4-3.seq
3885 GCC AAT AGG GAA ACT AAA TTA GGA AAA GCA GGA TAT GTA ACT GAC
4412 ANRETKLGKAGYVTD pHDMHgpm2.seq
4412 GCC AAC CGC GAG ACC AAG CTG GGC AAG GCC GGC TAC GTG ACC GAC

3950

3932 RGRQKVVPLTDTTNQ NL4-3 genbank.SEQ
3932 AGA GGA AGA CAA AAA GTT GTC CCC CTA ACG GAC ACA ACA AAT CAG
3930 RGRQKVVPLTDTTNQ pNL4-3.seq
3930 AGA GGA AGA CAA AAA GTT GTC CCC CTA ACG GAC ACA ACA AAT CAG
4457 RGRQKVVPLTDTTNQ pHDMHgpm2.seq
4457 CGC GGC CGC CAG AAG GTG GTG CCC CTG ACC GAC ACC ACC AAC CAG



3980 4010

3977 KTELQAIHLALQDSG NL4-3 genbank.SEQ
3977 AAG ACT GAG TTA CAA GCA ATT CAT CTA GCT TTG CAG GAT TCG GGA
3975 KTELQAIHLALQDSG pNL4-3.seq
3975 AAG ACT GAG TTA CAA GCA ATT CAT CTA GCT TTG CAG GAT TCG GGA
4502 KTELQAIHLALQDSG pHDMHgpm2.seq
4502 AAG ACC GAG CTG CAG GCC ATC CAC CTG GCC CTG CAA GAC TCC GGC

4040

4022 LEVNIVTDSQYALGI NL4-3 genbank.SEQ
4022 TTA GAA GTA AAC ATA GTG ACA GAC TCA CAA TAT GCA TTG GGA ATC
4020 LEVNIVTDSQYALGI pNL4-3.seq
4020 TTA GAA GTA AAC ATA GTG ACA GAC TCA CAA TAT GCA TTG GGA ATC
4547 LEVNIVTDSQYALGI pHDMHgpm2.seq
4547 CTG GAG GTG AAC ATC GTG ACC GAC TCC CAG TAT GCA TTG GGC ATC

4070 4100

4067 IQAQPDKSESELVSQ NL4-3 genbank.SEQ
4067 ATT CAA GCA CAA CCA GAT AAG AGT GAA TCA GAG TTA GTC AGT CAA
4065 IQAQPDKSESELVSQ pNL4-3.seq
4065 ATT CAA GCA CAA CCA GAT AAG AGT GAA TCA GAG TTA GTC AGT CAA
4592 IQAQPDKSESELVSQ pHDMHgpm2.seq
4592 ATC CAG GCC CAG CCC GAC AAG TCC GAG TCC GAG CTG GTG TCC CAG

4130

4112 IIEQLIKKEKVYLAW NL4-3 genbank.SEQ
4112 ATA ATA GAG CAG TTA ATA AAA AAG GAA AAA GTC TAC CTG GCA TGG
4110 IIEQLIKKEKVYLAW pNL4-3.seq
4110 ATA ATA GAG CAG TTA ATA AAA AAG GAA AAA GTC TAC CTG GCA TGG
4637 IIEQLIKKEKVYLAW pHDMHgpm2.seq
4637 ATC ATC GAG CAG CTG ATC AAG AAG GAG AAG GTG TAC CTG GCC TGG

4160 4190

4157 VPAHKGIGGNEQVDG NL4-3 genbank.SEQ
4157 GTA CCA GCA CAC AAA GGA ATT GGA GGA AAT GAA CAA GTA GAT GGG
4155 VPAHKGIGGNEQVDK pNL4-3.seq
4155 GTA CCA GCA CAC AAA GGA ATT GGA GGA AAT GAA CAA GTA GAT AAG
4682 VPAHKGIGGNEQVDK pHDMHgpm2.seq
4682 GTG CCC GCC CAC AAG GGC ATC GGC GGC AAC GAG CAG GTG GAC AAG

4220

4202 LVSAGIRKVLFLDGI NL4-3 genbank.SEQ
4202 TTG GTC AGT GCT GGA ATC AGG AAA GTA CTA TTT TTA GAT GGA ATA
4200 LVSAGIRKVLFLDGI pNL4-3.seq
4200 TTG GTC AGT GCT GGA ATC AGG AAA GTA CTA TTT TTA GAT GGA ATA
4727 LVSAGIRKVLFLDGI pHDMHgpm2.seq
4727 CTG GTG TCC GCC GGC ATC CGC AAG GTG CTG TTC CTG GAC GGC ATC

4250 4280

4247 DKAQEEHEKYHSNWR NL4-3 genbank.SEQ
4247 GAT AAG GCC CAA GAA GAA CAT GAG AAA TAT CAC AGT AAT TGG AGA
4245 DKAQEEHEKYHSNWR pNL4-3.seq
4245 GAT AAG GCC CAA GAA GAA CAT GAG AAA TAT CAC AGT AAT TGG AGA
4772 DKAQEEHEKYHSNWR pHDMHgpm2.seq
4772 GAC AAG GCC CAG GAG GAG CAC GAG AAG TAC CAC TCC AAC TGG CGC

4310

4292 AMASDFNLPPVVAKE NL4-3 genbank.SEQ
4292 GCA ATG GCT AGT GAT TTT AAC CTA CCA CCT GTA GTA GCA AAA GAA
4290 AMASDFNLPPVVAKE pNL4-3.seq
4290 GCA ATG GCT AGT GAT TTT AAC CTA CCA CCT GTA GTA GCA AAA GAA
4817 AMASDFNLPPVVAKE pHDMHgpm2.seq
4817 GCC ATG GCC TCC GAC TTC AAC CTG CCC CCC GTG GTG GCC AAG GAG

4340 4370

4337 IVASCDKCQLKGEAM NL4-3 genbank.SEQ
4337 ATA GTA GCC AGC TGT GAT AAA TGT CAG CTA AAA GGG GAA GCC ATG
4335 IVASCDKCQLKGEAM pNL4-3.seq
4335 ATA GTA GCC AGC TGT GAT AAA TGT CAG CTA AAA GGG GAA GCC ATG
4862 IVASCDKCQLKGEAM pHDMHgpm2.seq
4862 ATC GTG GCC TCC TGC GAC AAG TGC CAG CTG AAG GGC GAG GCC ATG

4400

4382 HGQVDCSPGIWQLDC NL4-3 genbank.SEQ
4382 CAT GGA CAA GTA GAC TGT AGC CCA GGA ATA TGG CAG CTA GAT TGT
4380 HGQVDCSPGIWQLDC pNL4-3.seq
4380 CAT GGA CAA GTA GAC TGT AGC CCA GGA ATA TGG CAG CTA GAT TGT
4907 HGQVDCSPGIWQLDC pHDMHgpm2.seq
4907 CAC GGC CAG GTG GAC TGC TCC CCC GGC ATC TGG CAG CTG GAC TGC

4430 4460

4427 THLEGKVILVAVHVA NL4-3 genbank.SEQ
4427 ACA CAT TTA GAA GGA AAA GTT ATC TTG GTA GCA GTT CAT GTA GCC
4425 THLEGKVILVAVHVA pNL4-3.seq
4425 ACA CAT TTA GAA GGA AAA GTT ATC TTG GTA GCA GTT CAT GTA GCC
4952 THLEGKVILVAVHVA pHDMHgpm2.seq
4952 ACC CAC CTG GAG GGC AAG GTG ATC CTG GTG GCC GTG CAC GTG GCC

4490

4472 SGYIEAEVIPAETGQ NL4-3 genbank.SEQ
4472 AGT GGA TAT ATA GAA GCA GAA GTA ATT CCA GCA GAG ACA GGG CAA
4470 SGYIEAEVIPAETGQ pNL4-3.seq
4470 AGT GGA TAT ATA GAA GCA GAA GTA ATT CCA GCA GAG ACA GGG CAA
4997 SGYIEAEVIPAETGQ pHDMHgpm2.seq
4997 TCC GGC TAC ATC GAG GCC GAG GTG ATC CCC GCC GAG ACC GGC CAG



4520 4550

4517 ETAYFLLKLAGRWPV NL4-3 genbank.SEQ
4517 GAA ACA GCA TAC TTC CTC TTA AAA TTA GCA GGA AGA TGG CCA GTA
4515 ETAYFLLKLAGRWPV pNL4-3.seq
4515 GAA ACA GCA TAC TTC CTC TTA AAA TTA GCA GGA AGA TGG CCA GTA
5042 ETAYFLLKLAGRWPV pHDMHgpm2.seq
5042 GAG ACC GCC TAC TTC CTG CTG AAG CTG GCC GGC CGC TGG CCC GTG

4580

4562 KTVHTDNGSNFTSTT NL4-3 genbank.SEQ
4562 AAA ACA GTA CAT ACA GAC AAT GGC AGC AAT TTC ACC AGT ACT ACA
4560 KTVHTDNGSNFTSTT pNL4-3.seq
4560 AAA ACA GTA CAT ACA GAC AAT GGC AGC AAT TTC ACC AGT ACT ACA
5087 KTVHTDNGSNFTSTT pHDMHgpm2.seq
5087 AAG ACC GTG CAC ACC GAC AAC GGC TCC AAC TTC ACC TCC ACC ACC

4610 4640

4607 VKAACWWAGIKQEFG NL4-3 genbank.SEQ
4607 GTT AAG GCC GCC TGT TGG TGG GCG GGG ATC AAG CAG GAA TTT GGC
4605 VKAACWWAGIKQEFG pNL4-3.seq
4605 GTT AAG GCC GCC TGT TGG TGG GCG GGG ATC AAG CAG GAA TTT GGC
5132 VKAACWWAGIKQEFG pHDMHgpm2.seq
5132 GTG AAG GCC GCC TGC TGG TGG GCC GGC ATC AAG CAG GAG TTC GGC

4670

4652 IPYNPQSQGVIESMN NL4-3 genbank.SEQ
4652 ATT CCC TAC AAT CCC CAA AGT CAA GGA GTA ATA GAA TCT ATG AAT
4650 IPYNPQSQGVIESMN pNL4-3.seq
4650 ATT CCC TAC AAT CCC CAA AGT CAA GGA GTA ATA GAA TCT ATG AAT
5177 IPYNPQSQGVIESMN pHDMHgpm2.seq
5177 ATC CCC TAC AAC CCC CAG TCC CAG GGC GTG ATC GAG TCC ATG AAC

4700 4730

4697 KELKKIIGQVRDQAE NL4-3 genbank.SEQ
4697 AAA GAA TTA AAG AAA ATT ATA GGA CAG GTA AGA GAT CAG GCT GAA
4695 KELKKIIGQVRDQAE pNL4-3.seq
4695 AAA GAA TTA AAG AAA ATT ATA GGA CAG GTA AGA GAT CAG GCT GAA
5222 KELKKIIGQVRDQAE pHDMHgpm2.seq
5222 AAG GAG CTG AAG AAG ATC ATC GGC CAA GTC CGC GAC CAG GCC GAG

4760

4742 HLKTAVQMAVFIHNF NL4-3 genbank.SEQ
4742 CAT CTT AAG ACA GCA GTA CAA ATG GCA GTA TTC ATC CAC AAT TTT
4740 HLKTAVQMAVFIHNF pNL4-3.seq
4740 CAT CTT AAG ACA GCA GTA CAA ATG GCA GTA TTC ATC CAC AAT TTT
5267 HLKTAVQMAVFIHNF pHDMHgpm2.seq
5267 CAC CTG AAG ACC GCC GTG CAG ATG GCC GTG TTC ATC CAC AAC TTC



4790 4820

4787 KRKGGIGGYSAGERI NL4-3 genbank.SEQ
4787 AAA AGA AAA GGG GGG ATT GGG GGG TAC AGT GCA GGG GAA AGA ATA
4785 KRKGGIGGYSAGERI pNL4-3.seq
4785 AAA AGA AAA GGG GGG ATT GGG GGG TAC AGT GCA GGG GAA AGA ATA
5312 KRKGGIGGYSAGERI pHDMHgpm2.seq
5312 AAG CGC AAG GGC GGC ATC GGC GGC TAC TCC GCC GGC GAG CGC ATC

4850

4832 VDIIATDIQTKELQK NL4-3 genbank.SEQ
4832 GTA GAC ATA ATA GCA ACA GAC ATA CAA ACT AAA GAA TTA CAA AAA
4830 VDIIATDIQTKELQK pNL4-3.seq
4830 GTA GAC ATA ATA GCA ACA GAC ATA CAA ACT AAA GAA TTA CAA AAA
5357 VDIIATDIQTKELQK pHDMHgpm2.seq
5357 GTG GAC ATC ATC GCC ACC GAC ATC CAG ACC AAG GAG CTG CAG AAG

4880 4910

4877 QITKIQNFRVYYRDS NL4-3 genbank.SEQ
4877 CAA ATT ACA AAA ATT CAA AAT TTT CGG GTT TAT TAC AGG GAC AGC
4875 QITKIQNFRVYYRDS pNL4-3.seq
4875 CAA ATT ACA AAA ATT CAA AAT TTT CGG GTT TAT TAC AGG GAC AGC
5402 QITKIQNFRVYYRDS pHDMHgpm2.seq
5402 CAG ATC ACC AAG ATC CAG AAC TTC CGC GTG TAC TAC CGC GAC TCC

4940

4922 RDPVWKGPAKLLWKG NL4-3 genbank.SEQ
4922 AGA GAT CCA GTT TGG AAA GGA CCA GCA AAG CTC CTC TGG AAA GGT
4920 RDPVWKGPAKLLWKG pNL4-3.seq
4920 AGA GAT CCA GTT TGG AAA GGA CCA GCA AAG CTC CTC TGG AAA GGT
5447 RDPVWKGPAKLLWKG pHDMHgpm2.seq
5447 CGC GAC CCC GTG TGG AAG GGC CCC GCC AAG CTG CTG TGG AAG GGC

4970 5000

4967 EGAVVIQDNSDIKVV NL4-3 genbank.SEQ
4967 GAA GGG GCA GTA GTA ATA CAA GAT AAT AGT GAC ATA AAA GTA GTG
4965 EGAVVIQDNSDIKVV pNL4-3.seq
4965 GAA GGG GCA GTA GTA ATA CAA GAT AAT AGT GAC ATA AAA GTA GTG
5492 EGAVVIQDNSDIKVV pHDMHgpm2.seq
5492 GAG GGC GCC GTG GTG ATC CAG GAC AAC TCC GAC ATC AAG GTG GTG

5030

5012 PRRKAKIIRDYGKQM NL4-3 genbank.SEQ
5012 CCA AGA AGA AAA GCA AAG ATC ATC AGG GAT TAT GGA AAA CAG ATG
5010 PRRKAKIIRDYGKQM pNL4-3.seq
5010 CCA AGA AGA AAA GCA AAG ATC ATC AGG GAT TAT GGA AAA CAG ATG
5537 PRRKAKIIRDYGKQM pHDMHgpm2.seq
5537 CCC CGC CGC AAG GCC AAG ATC ATC CGC GAC TAC GGC AAG CAG ATG

5060 5090

5057 AGDDCVASRQDED* NL4-3 genbank.SEQ
5057 GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT TAA
5055 AGDDCVASRQDED* pNL4-3.seq
5055 GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT TAA
5582 AGDDCVASRQDED* pHDMHgpm2.seq
5582 GCC GGC GAC GAC TGC GTG GCC TCC CGC CAG GAC GAG GAC TAA
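
One way to summarize the substitutions shown in the alignment is to compare the base composition at the third codon position of a wild-type row and the corresponding pHDMHgpm2 row; in the rows shown, the substituted codons mostly end in G or C. The following is a minimal sketch of that comparison. The function name `gc3_fraction` is an illustrative assumption, and the calculation is a simple descriptive statistic rather than a method stated in the source.

```python
def gc3_fraction(codons: str) -> float:
    """Fraction of codons in a whitespace-separated row whose third base is G or C."""
    triplets = codons.split()
    hits = sum(1 for c in triplets if c[2] in "GC")
    return hits / len(triplets)


# Rows copied from the final alignment block at 5055 (pNL4-3) and 5582 (pHDMHgpm2),
# including the terminal stop codon.
WT = "GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT TAA"
OPT = "GCC GGC GAC GAC TGC GTG GCC TCC CGC CAG GAC GAG GAC TAA"

if __name__ == "__main__":
    print(f"wild-type GC3: {gc3_fraction(WT):.2f}")
    print(f"optimized GC3: {gc3_fraction(OPT):.2f}")
```

The same function can be applied to longer stretches by concatenating the codon rows of consecutive blocks.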
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AGCTTGGCCC ATTGCATACG TTGTATCCAT ATCATAATAT GTACATTTAT ATTGGCTCAT 60 

GTCCAACATT ACCGCCATGT TGACATTGAT TATTGACTAG TTATTAATAG TAATCAATTA 120 

CGGGGTCATT AGTTCATAGC CCATATATGG AGTTCCGCGT TACATAACTT ACGGTAAATG 180 

GCCCGCCTGG CTGACCGCCC AACGACCCCC GCCCATTGAC GTCAATAATG ACGTATGTTC 240 

CCATAGTAAC GCCAATAGGG ACTTTCCATT GACGTCAATG GGTGGAGTAT TTACGGTAAA 300 

CTGCCCACTT GGCAGTACAT CAAGTGTATC ATATGCCAAG TACGCCCCCT ATTGACGTCA 360

ATGACGGTAA ATGGCCCGCC TGGCATTATG CCCAGTACAT GACCTTATGG GACTTTCCTA 420 

CTTGGCAGTA CATCTACGTA TTAGTCATCG CTATTACCAT GGTGATGCGG TTTTGGCAGT 480 

ACATCAATGG GCGTGGATAG CGGTTTGACT CACGGGGATT TCCAAGTCTC CACCCCATTG 540 

ACGTCAATGG GAGTTTGTTT TGGCACCAAA ATCAACGGGA CTTTCCAAAA TGTCGTAACA 600 

ACTCCGCCCC ATTGACGCAA ATGGGCGGTA GGCGTGTACG GTGGGAGGTC TATATAAGCA 660 

GAGCTCGTTT AGTGAACCGT CAGATCGCCT GGAGACGCCA TCCACGCTGT TTTGACCTCC 720 

ATAGAAGACA CCGGGACCGA TCCAGCCTCC CCTCGAAGCT GATCCTGAGA ACTTCAGGGT 780 

GAGTCTATGG GACCCTTGAT GTTTTCTTTC CCCTTCTTTT CTATGGTTAA GTTCATGTCA 840

TAGGAAGGGG AGAAGTAACA GGGTACACAT ATTGACCAAA TCAGGGTAAT TTTGCATTTG 900 

TAATTTTAAA AAATGCTTTC TTCTTTTAAT ATACTTTTTT GTTTATCTTA TTTCTAATAC 960 

TTTCCCTAAT CTCTTTCTTT CAGGGCAATA ATGATACAAT GTATCATGCC TCTTTGCACC 1020 

ATTCTAAAGA ATAACAGTGA TAATTTCTGG GTTAAGGCAA TAGCAATATT TCTGCATATA 1080 

AATATTTCTG CATATAAATT GTAACTGATG TAAGAGGTTT CATATTGCTA ATAGCAGCTA 1140 

CAATCCAGCT ACCATTCTGC TTTTATTTTA TGGTTGGGAT AAGGCTGGAT TATTCTGAGT 1200 

CCAAGCTAGG CCCTTTTGCT AATCATGTTC ATACCTCTTA TCTTCCTCCC ACAGCTCCTG 1260 

GGCAACGTGC TGGTCTGTGT GCTGGCCCAT CACTTTGGCA AAGAATTCTA GACTGCCATG 1320 

GGCGCCCGCG CCTCCGTGCT GTCCGGCGGC GAGCTGGACA AGTGGGAGAA GATCCGCCTG 1380 

CGCCCCGGCG GCAAGAAGCA GTACAAGCTG AAGCACATCG TGTGGGCCTC CCGCGAGCTG 1440 

GAGCGCTTCG CCGTGAACCC CGGCCTGCTG GAGACCTCCG AGGGCTGCCG CCAGATCCTG 1500

GGCCAGCTGC AGCCCTCCCT GCAAACCGGC TCCGAGGAGC TGCGCTCCCT GTACAACACC 1560 

ATCGCCGTGC TGTACTGCGT GCACCAGCGC ATCGACGTGA AGGACACCAA GGAGGCCCTG 1620 

GACAAGATCG AGGAGGAGCA GAACAAGTCC AAGAAGAAGG CCCAGCAGGC CGCCGCCGAC 1680 

ACCGGCAACA ACTCCCAGGT GTCCCAGAAC TACCCCATCG TGCAGAACCT GCAGGGCCAG 1740 

ATGGTGCACC AGGCCATCTC CCCCCGCACC CTGAACGCCT GGGTGAAGGT GGTGGAGGAG 1800 

AAGGCCTTCT CCCCCGAAGT CATCCCCATG TTCTCCGCCC TGTCCGAGGG CGCCACCCCC 1860 

CAGGACCTGA ACACCATGCT GAACACCGTG GGCGGCCACC AGGCCGCCAT GCAGATGCTG 1920 

AAGGAGACCA TCAACGAGGA GGCCGCCGAG TGGGACCGCC TGCACCCCGT GCACGCCGGC 1980 

CCCATCGCCC CCGGCCAGAT GCGCGAGCCC CGCGGCTCCG ACATCGCCGG CACCACCTCC 2040 

ACCCTGCAAG AGCAGATCGG CTGGATGACC CACAACCCCC CCATCCCCGT GGGCGAGATC 2100 

TACAAGCGCT GGATCATCCT GGGCCTGAAC AAGATCGTGC GCATGTACTC CCCCACCTCC 2160 

ATCCTGGACA TCCGCCAGGG CCCCAAGGAG CCCTTCCGCG ACTACGTGGA CCGCTTCTAC 2220 

AAGACCCTGC GCGCCGAGCA GGCCTCCCAG GAGGTAAAGA ACTGGATGAC CGAGACCCTG 2280 

CTGGTGCAGA ACGCCAACCC CGACTGCAAG ACCATCCTGA AGGCCCTGGG CCCCGGCGCC 2340 

ACCCTGGAGG AGATGATGAC CGCCTGCCAG GGCGTGGGCG GCCCCGGCCA CAAGGCCCGC 2400 

GTGCTGGCCG AGGCCATGTC CCAAGTCACC AACCCCGCCA CCATCATGAT CCAGAAGGGC 2460 

AACTTCCGCA ACCAGCGCAA GACCGTGAAG TGCTTCAACT GCGGCAAGGA GGGCCACATC 2520 

GCCAAGAACT GCCGCGCCCC CCGCAAGAAG GGCTGCTGGA AGTGCGGCAA GGAGGGCCAC 2580 

CAGATGAAAG ATTGTACTGA GAGACAGGCT AATTTTTTAG GGAAGATCTG GCCTTCCCAC 2640 

AAGGGAAGGC CAGGGAATTT TCTTCAGAGC AGACCAGAGC CAACAGCCCC ACCAGAAGAG 2700 

AGCTTCAGGT TTGGGGAAGA GACAACAACT CCCTCTCAGA AGCAGGAGCC GATAGACAAG 2760 

GAACTGTATC CTTTAGCTTC CCTCAGATCA CTCTTTGGCA GCGACCCCTC GTCACAATAA 2820 
AGATCGGTGG CCAGCTGAAG GAGGCCCTGC TGGACACCGG CGCCGACGAC ACCGTGCTGG 2880

AGGAGATGAA CCTGCCCGGC CGCTGGAAGC CCAAGATGAT CGGCGGCATC GGCGGCTTCA 2940

TCAAAGTCCG CCAGTACGAC CAGATCCTGA TCGAGATCTG CGGCCACAAG GCCATCGGCA 3000 

CCGTGCTGGT GGGCCCCACC CCCGTGAACA TCATCGGCCG CAACCTGCTG ACCCAGATCG 3060 

GCTGCACCCT GAACTTCCCC ATCTCCCCCA TCGAGACCGT GCCCGTGAAG CTGAAGCCCG 3120 

GCATGGACGG CCCCAAAGTC AAGCAGTGGC CCCTGACCGA GGAGAAGATC AAGGCCCTGG 3180 

TGGAGATCTG CACCGAGATG GAGAAGGAGG GCAAGATCTC CAAGATCGGC CCCGAGAACC 3240

CCTACAACAC CCCCGTGTTC GCCATCAAGA AGAAGGACTC CACCAAGTGG CGCAAGCTGG 3300 

TGGACTTCCG CGAGCTGAAC AAGCGCACCC AGGACTTCTG GGAGGTGCAG CTGGGCATCC 3360 

CCCACCCCGC CGGCCTGAAG CAGAAGAAGT CCGTGACCGT GCTGGACGTG GGCGACGCCT 3420

ACTTCTCCGT GCCCCTGGAC AAGGACTTCC GCAAGTACAC CGCCTTCACC ATCCCCTCCA 3480 

TCAACAACGA GACCCCCGGC ATCCGCTACC AGTACAACGT GCTGCCCCAG GGCTGGAAGG 3540 

GCTCCCCCGC CATCTTCCAG TGCTCCATGA CCAAGATCCT GGAGCCCTTC CGCAAGCAGA 3600 

ACCCCGACAT CGTGATCTAC CAGTACATGG ACGACCTGTA CGTGGGCTCC GACCTGGAGA 3660

TCGGCCAGCA CCGCACCAAG ATCGAGGAGC TGCGCCAGCA CCTGCTGCGC TGGGGCTTCA 3720 

CCACCCCCGA CAAGAAGCAC CAGAAGGAGC CCCCCTTCCT GTGGATGGGC TACGAGCTGC 3780 

ACCCCGACAA GTGGACCGTG CAGCCCATCG TGCTGCCCGA GAAGGACTCC TGGACCGTGA 3840 

ACGACATCCA GAAGCTGGTG GGCAAGCTGA ACTGGGCCTC CCAGATCTAC GCCGGCATCA 3900 

AAGTCCGCCA GCTGTGCAAG CTGCTGCGCG GCACCAAGGC CCTGACCGAG GTGGTGCCCC 3960 

TGACCGAGGA GGCCGAGCTG GAGCTGGCCG AGAACCGCGA GATCCTGAAG GAGCCCGTGC 4020 

ACGGCGTGTA CTACGACCCC TCCAAGGACC TGATCGCCGA GATCCAGAAG CAGGGCCAGG 4080 

GCCAGTGGAC CTACCAGATC TACCAGGAGC CCTTCAAGAA CCTGAAGACC GGCAAATACG 4140 

CCCGCATGAA GGGCGCCCAC ACCAACGACG TGAAGCAGCT GACCGAGGCC GTGCAGAAGA 4200 

TCGCCACCGA GTCCATCGTG ATCTGGGGCA AGACTCCCAA GTTCAAGCTG CCCATCCAGA 4260 

AGGAGACCTG GGAGGCCTGG TGGACCGAGT ACTGGCAGGC CACCTGGATC CCCGAGTGGG 4320 

AGTTCGTGAA CACCCCCCCC CTGGTGAAGC TGTGGTACCA GCTGGAGAAG GAGCCCATCA 4380 

TCGGCGCCGA GACCTTCTAC GTGGACGGCG CCGCCAACCG CGAGACCAAG CTGGGCAAGG 4440 

CCGGCTACGT GACCGACCGC GGCCGCCAGA AGGTGGTGCC CCTGACCGAC ACCACCAACC 4500 

AGAAGACCGA GCTGCAGGCC ATCCACCTGG CCCTGCAAGA CTCCGGCCTG GAGGTGAACA 4560 

TCGTGACCGA CTCCCAGTAT GCATTGGGCA TCATCCAGGC CCAGCCCGAC AAGTCCGAGT 4620 

CCGAGCTGGT GTCCCAGATC ATCGAGCAGC TGATCAAGAA GGAGAAGGTG TACCTGGCCT 4680 

GGGTGCCCGC CCACAAGGGC ATCGGCGGCA ACGAGCAGGT GGACAAGCTG GTGTCCGCCG 4740 

GCATCCGCAA GGTGCTGTTC CTGGACGGCA TCGACAAGGC CCAGGAGGAG CACGAGAAGT 4800 

ACCACTCCAA CTGGCGCGCC ATGGCCTCCG ACTTCAACCT GCCCCCCGTG GTGGCCAAGG 4860 

AGATCGTGGC CTCCTGCGAC AAGTGCCAGC TGAAGGGCGA GGCCATGCAC GGCCAGGTGG 4920

ACTGCTCCCC CGGCATCTGG CAGCTGGACT GCACCCACCT GGAGGGCAAG GTGATCCTGG 4980 

TGGCCGTGCA CGTGGCCTCC GGCTACATCG AGGCCGAGGT GATCCCCGCC GAGACCGGCC 5040 

AGGAGACCGC CTACTTCCTG CTGAAGCTGG CCGGCCGCTG GCCCGTGAAG ACCGTGCACA 5100 

CCGACAACGG CTCCAACTTC ACCTCCACCA CCGTGAAGGC CGCCTGCTGG TGGGCCGGCA 5160 

TCAAGCAGGA GTTCGGCATC CCCTACAACC CCCAGTCCCA GGGCGTGATC GAGTCCATGA 5220 

ACAAGGAGCT GAAGAAGATC ATCGGCCAAG TCCGCGACCA GGCCGAGCAC CTGAAGACCG 5280 

CCGTGCAGAT GGCCGTGTTC ATCCACAACT TCAAGCGCAA GGGCGGCATC GGCGGCTACT 5340 

CCGCCGGCGA GCGCATCGTG GACATCATCG CCACCGACAT CGAGACCAAG GAGCTGCAGA 5400 

AGCAGATCAC CAAGATCCAG AACTTCCGCG TGTACTACCG CGACTCCCGC GACCCCGTGT 5460 

GGAAGGGCCC CGCCAAGCTG CTGTGGAAGG GCGAGGGCGC CGTGGTGATC CAGGACAACT 5520

CCGACATCAA GGTGGTGCCC CGCCGCAAGG CCAAGATCAT CCGCGACTAC GGCAAGCAGA 5580 

TGGCCGGCGA CGACTGCGTG GCCTCCCGCC AGGACGAGGA CTAACACATG GAAAAGATTA 5640 
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GTAAAACACC ATAGGCCGCT CTAGAGGATC CAAGCTTATC GATACCGTCG ACCTCGAGGG 5700 

CCCAGATCTA ATTCACCCCA CCAGTGCAGG CTGCCTATCA GAAAGTGGTG GCTGGTGTGG 5760 

CTAATGCCCT GGCCCACAAG TATCACTAAG CTCGCTTTCT TGCTGTCCAA TTTCTATTAA 5820 

AGGTTCCTTT GTTCCCTAAG TCCAACTACT AAACTGGGGG ATATTATGAA GGGCCTTGAG 5880 

CATCTGGATT CTGCCTAATA AAAAACATTT ATTTTCATTG CAATGATGTA TTTAAATTAT 5940 

TTCTGAATAT TTTACTAAAA AGGGAATGTG GGAGGTCAGT GCATTTAAAA CATAAAGAAA 6000

TGAAGAGCTA GTTCAAACCT TGGGAAAATA CACTATATCT TAAACTCCAT GAAAGAAGGT 6060

GAGGCTGCAA ACAGCTAATG CACATTGGCA ACAGCCCCTG ATGCCTATGC CTTATTCATC 6120 

CCTCAGAAAA GGATTCAAGT AGAGGCTTGA TTTGGAGGTT AAAGTTTTGC TATGCTGTAT 6180 

TTTACATTAC TTATTGTTTT AGCTGTCCTC ATGAATGTCT TTTCACTACC CATTTGCTTA 6240 

TCCTGCATCT CTCAGCCTTG ACTCCACTCA GTTCTCTTGC TTAGAGATAC CACCTTTCCC 6300 

CTGAAGTGTT CCTTCCATGT TTTACGGCGA GATGGTTTCT CCTCGCCTGG CCACTCAGCC 6360 

TTAGTTGTCT CTGTTGTCTT ATAGAGGTCT ACTTGAAGAA GGAAAAACAG GGGGCATGGT 6420 

TTGACTGTCC TGTGAGCCCT TCTTCCCTGC CTCCCCCACT CACAGTGACC CGGAATCCCT 6480 

CGACATGGCA GTCTAGATCA TTCTTGAAGA CGAAAGGGCC TCGTGATACG CCTATTTTTA 6540 

TAGGTTAATG TCATGATAAT AATGGTTTCT TAGACGTCAG GTGGCACTTT TCGGGGAAAT 6600 

GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA TCCGCTCATG 6660 

AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA GGAAGAGTAT GAGTATTCAA 6720

CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT GCCTTCCTGT TTTTGCTCAC 6780 

CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT TGGGTGCACG AGTGGGTTAC 6840 

ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT TTCGCCCCGA AGAACGTTTT 6900 

CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG TATTATCCCG TATTGACGCC 6960 

GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA ATGACTTGGT TGAGTACTCA 7020 

CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA GAGAATTATG CAGTGCTGCC 7080 

ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA CAACGATCGG AGGACCGAAG 7140 

GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA CTCGCCTTGA TCGTTGGGAA 7200

CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA CCACGATGCC TGTAGCAATG 7260 

GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA CTCTAGCTTC CCGGCAACAA 7320 

TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC TTCTGCGCTC GGCCCTTCCG 7380 

GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC GTGGGTCTCG CGGTATCATT 7440 

GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG TTATCTACAC GACGGGGAGT 7500 

CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA TAGGTGCCTC ACTGATTAAG 7560 

CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT AGATTGATTT AAAACTTCAT 7620 

TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA ATCTCATGAC CAAAATCCCT 7680 

TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG AAAAGATCAA AGGATCTTCT 7740 

TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA CAAAAAAACC ACCGCTACCA 7800 

GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT TTCCGAAGGT AACTGGCTTC 7860 

AGCAGAGCGC AGATACCAAA TACTGTTCTT CTAGTGTAGC CGTAGTTAGG CCACCACTTC 7920 

AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA TCCTGTTACC AGTGGCTGCT 7980 

GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA GACGATAGTT ACCGGATAAG 8040 

GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC CCAGCTTGGA GCGAACGACC 8100 

TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA GCGCCACGCT TCCCGAAGGG 8160 

AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA CAGGAGAGCG CACGAGGGAG 8220 

CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG GGTTTCGCCA CCTCTGACTT 8280 

GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC TATGGAAAAA CGCCAGCAAC 8340 

GGATGCGCCG CGTGCGGCTG CTGGAGATGG CGGACGCGAT GGATATGTTC TGCCAAGGGT 8400 

TGGTTTGCGC ATTCACAGTT CTCCGCAAGA ATTGATTGGC TCCAATTCTT GGAGTGGTGA 8460 
ATCCGTTAGC GAGGTGCCGC CGGCTTCCAT TCAGGTCGAG GTGGCCCGGC TCCATGCACC 8520

GCGACGCAAC GCGGGGAGGC AGACAAGGTA TAGGGCGGCG CCTACAATCC ATGCCAACCC 8580 

GTTCCATGTG CTCGCCGAGG CGGCATAAAT CCCCGTGACG ATCAGCGGTC CAATGATCGA 8640 

AGTTAGGCTG GTAAGAGCCG CGAGCGATCC TTGAAGCTGT CCCTGATGGT CGTCATCTAC 8700 

CTGCCTGGAC AGCATGGCCT GCAACGCGGG CATCCCGATG CCGCCGGAAG CGAGAAGAAT 8760 

CATAATGGGG AAGGCCATCC AGCCTCGCGT CGGGGAGCTT TTTGCAAAAG CCTAGGCCTC 8820 

CAAAAAAGCC TCCTCACTAC TTCTGGAATA GCTCAGAGGC CGAGGCGGCC TCGGCCTCTG 8880

CATAAATAAA AAAAATTAGT CAGCCATG 8908



Fig. 10D 





Fig. 11 



INTERNATIONAL SEARCH REPORT

International Application No
PCT/US 99/20675



A. CLASSIFICATION OF SUBJECT MATTER

IPC 7 C12N15/86 C12N5/10 C12N7/04 C12N15/49 C07K14/16

According to International Patent Classification (IPC) or to both national classification and IPC




B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)
IPC 7 C12N C07K

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)


C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category: A
Citation of document, with indication, where appropriate, of the relevant passages:
NALDINI L ET AL: "IN VIVO GENE DELIVERY
AND STABLE TRANSDUCTION OF NONDIVIDING
CELLS BY A LENTIVIRAL VECTOR"
SCIENCE, US, AMERICAN ASSOCIATION FOR THE
ADVANCEMENT OF SCIENCE,
vol. 272, no. 5259,
12 April 1996 (1996-04-12), pages 263-267,
XP000583652
ISSN: 0036-8075
cited in the application
the whole document
Relevant to claim No.: 1-4




[X] Further documents are listed in the continuation of box C.

[ ] Patent family members are listed in annex.


* Special categories of cited documents:

"A" document defining the general state of the art which is not
considered to be of particular relevance

"E" earlier document but published on or after the international
filing date

"L" document which may throw doubts on priority claim(s) or
which is cited to establish the publication date of another
citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or
other means

"P" document published prior to the international filing date but
later than the priority date claimed

"T" later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention

"X" document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art.

"&" document member of the same patent family


Date of the actual completion of the international search: 2b February 2000

Date of mailing of the international search report: 03/03/2000




Name and mailing address of the ISA

European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016

Authorized officer

Chambonnet, F



Form PCT/ISA/210 (second sheet) (July 1992)






C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Category

Citation of document, with indication, where appropriate, of the relevant passages



HASELHORST D ET AL: "STABLE PACKAGING
CELL LINES AND HIV-1 BASED RETROVIRAL
VECTOR SYSTEMS"
GENE THERAPY, GB, MACMILLAN PRESS LTD.,
BASINGSTOKE,
vol. 1, no. SUPPL. 02,
18 November 1994 (1994-11-18), page S14
XP002063698
ISSN: 0969-7128
the whole document

ST LOUIS D ET AL: "CONSTRUCTION AND
CHARACTERIZATION OF HIV-1 RETROVIRAL
VECTORS AND REPLICATION-DEFECTIVE HIV-1
PACKAGING CELL LINES"
INTERNATIONAL CONFERENCE ON AIDS AND THE
STD WORLD CONGRESS, XX, XX,
1 June 1993 (1993-06-01), page 244
XP002063695
the whole document

CARROLL R ET AL: "A HUMAN
IMMUNODEFICIENCY VIRUS TYPE 1
(HIV-1)-BASED RETROVIRAL VECTOR SYSTEM
UTILIZING STABLE HIV-1 PACKAGING CELL
LINES"
JOURNAL OF VIROLOGY, US, THE AMERICAN
SOCIETY FOR MICROBIOLOGY,
vol. 68, no. 9,
1 September 1994 (1994-09-01), pages
6047-6051, XP002063697
ISSN: 0022-538X
the whole document

HOLLER T P ET AL: "HIV1 INTEGRASE
EXPRESSED IN ESCHERICHIA COLI FROM A
SYNTHETIC GENE"
GENE, NL, ELSEVIER BIOMEDICAL PRESS,
AMSTERDAM,
vol. 136, 22 December 1993 (1993-12-22),
pages 323-328, XP000199775
ISSN: 0378-1119
the whole document

ANDRE S ET AL: "INCREASED IMMUNE RESPONSE
ELICITED BY DNA VACCINATION WITH A
SYNTHETIC GP120 SEQUENCE WITH OPTIMIZED
CODON USAGE"
JOURNAL OF VIROLOGY, US, THE AMERICAN
SOCIETY FOR MICROBIOLOGY,
vol. 72, no. 2,
1 February 1998 (1998-02-01), pages
1497-1503, XP002073767
ISSN: 0022-538X
the whole document



39 



39-43 



Form PCT/ISA/210 (continuation of second sheet) (July 1992)



